{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "How to Create an Automatic Code Comment Generator using Deep Learning.ipynb",
      "provenance": [],
      "collapsed_sections": [],
      "toc_visible": true,
      "machine_shape": "hm"
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BP-nRxOCnfZz",
        "colab_type": "text"
      },
      "source": [
        "# How to Create an Automatic Code Comment Generator using Deep Learning!\n",
        "> A tutorial for automatically generating code comments using Deep Learning.\n",
        "\n",
        "- toc: true \n",
        "- badges: true\n",
        "- comments: true\n",
        "- categories: [deep_learning, software_engineering]\n",
        "- image: images/comment_gen_preview.png"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TXumG5Ymn1u1",
        "colab_type": "text"
      },
      "source": [
        "# About"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vDlYPJ0s6gKZ",
        "colab_type": "text"
      },
      "source": [
        "In this post, you will create a Deep Learning model that, given a piece of code, automatically generates a comment describing (hopefully 🤞) what the piece of code does. This post focuses on Java code; however, the same approach should apply to other programming languages such as Python or JavaScript."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "PIbOjaSlU4HE",
        "colab_type": "code",
        "outputId": "a8b28127-2916-45b9-b3a5-35023152f87e",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        }
      },
      "source": [
        "#hide\n",
        "# Dependencies\n",
        "! pip install fastai nbdev sentencepiece"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Requirement already satisfied: fastai in /usr/local/lib/python3.6/dist-packages (1.0.60)\n",
            "Collecting nbdev\n",
            "\u001b[?25l  Downloading https://files.pythonhosted.org/packages/34/a5/00807d046fe451bc5c2756fea055f3c24ab02b7c52912ee44a52da320a7d/nbdev-0.2.13-py3-none-any.whl (44kB)\n",
            "\u001b[K     |████████████████████████████████| 51kB 1.7MB/s \n",
            "\u001b[?25hCollecting sentencepiece\n",
            "\u001b[?25l  Downloading https://files.pythonhosted.org/packages/74/f4/2d5214cbf13d06e7cb2c20d84115ca25b53ea76fa1f0ade0e3c9749de214/sentencepiece-0.1.85-cp36-cp36m-manylinux1_x86_64.whl (1.0MB)\n",
            "\u001b[K     |████████████████████████████████| 1.0MB 8.0MB/s \n",
            "\u001b[?25hRequirement already satisfied: packaging in /usr/local/lib/python3.6/dist-packages (from fastai) (20.1)\n",
            "Requirement already satisfied: fastprogress>=0.2.1 in /usr/local/lib/python3.6/dist-packages (from fastai) (0.2.2)\n",
            "Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (from fastai) (1.4.1)\n",
            "Requirement already satisfied: pyyaml in /usr/local/lib/python3.6/dist-packages (from fastai) (3.13)\n",
            "Requirement already satisfied: torchvision in /usr/local/lib/python3.6/dist-packages (from fastai) (0.5.0)\n",
            "Requirement already satisfied: numexpr in /usr/local/lib/python3.6/dist-packages (from fastai) (2.7.1)\n",
            "Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.6/dist-packages (from fastai) (4.6.3)\n",
            "Requirement already satisfied: nvidia-ml-py3 in /usr/local/lib/python3.6/dist-packages (from fastai) (7.352.0)\n",
            "Requirement already satisfied: spacy>=2.0.18 in /usr/local/lib/python3.6/dist-packages (from fastai) (2.1.9)\n",
            "Requirement already satisfied: Pillow in /usr/local/lib/python3.6/dist-packages (from fastai) (6.2.2)\n",
            "Requirement already satisfied: torch>=1.0.0 in /usr/local/lib/python3.6/dist-packages (from fastai) (1.4.0)\n",
            "Requirement already satisfied: dataclasses; python_version < \"3.7\" in /usr/local/lib/python3.6/dist-packages (from fastai) (0.7)\n",
            "Requirement already satisfied: numpy>=1.15 in /usr/local/lib/python3.6/dist-packages (from fastai) (1.17.5)\n",
            "Requirement already satisfied: requests in /usr/local/lib/python3.6/dist-packages (from fastai) (2.21.0)\n",
            "Requirement already satisfied: pandas in /usr/local/lib/python3.6/dist-packages (from fastai) (0.25.3)\n",
            "Requirement already satisfied: bottleneck in /usr/local/lib/python3.6/dist-packages (from fastai) (1.3.1)\n",
            "Requirement already satisfied: matplotlib in /usr/local/lib/python3.6/dist-packages (from fastai) (3.1.3)\n",
            "Requirement already satisfied: nbformat>=4.4.0 in /usr/local/lib/python3.6/dist-packages (from nbdev) (5.0.4)\n",
            "Collecting fastscript\n",
            "  Downloading https://files.pythonhosted.org/packages/55/0e/ecdc0213646bc82986884121109a38b50bbc2cd2c491bbbfdc7ae39228e3/fastscript-0.1.4-py3-none-any.whl\n",
            "Requirement already satisfied: nbconvert>=5.6.1 in /usr/local/lib/python3.6/dist-packages (from nbdev) (5.6.1)\n",
            "Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from packaging->fastai) (1.12.0)\n",
            "Requirement already satisfied: pyparsing>=2.0.2 in /usr/local/lib/python3.6/dist-packages (from packaging->fastai) (2.4.6)\n",
            "Requirement already satisfied: blis<0.3.0,>=0.2.2 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.0.18->fastai) (0.2.4)\n",
            "Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.0.18->fastai) (2.0.3)\n",
            "Requirement already satisfied: wasabi<1.1.0,>=0.2.0 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.0.18->fastai) (0.6.0)\n",
            "Requirement already satisfied: preshed<2.1.0,>=2.0.1 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.0.18->fastai) (2.0.1)\n",
            "Requirement already satisfied: plac<1.0.0,>=0.9.6 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.0.18->fastai) (0.9.6)\n",
            "Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.0.18->fastai) (1.0.2)\n",
            "Requirement already satisfied: srsly<1.1.0,>=0.0.6 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.0.18->fastai) (1.0.1)\n",
            "Requirement already satisfied: thinc<7.1.0,>=7.0.8 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.0.18->fastai) (7.0.8)\n",
            "Requirement already satisfied: idna<2.9,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests->fastai) (2.8)\n",
            "Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests->fastai) (3.0.4)\n",
            "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests->fastai) (2019.11.28)\n",
            "Requirement already satisfied: urllib3<1.25,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests->fastai) (1.24.3)\n",
            "Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.6/dist-packages (from pandas->fastai) (2018.9)\n",
            "Requirement already satisfied: python-dateutil>=2.6.1 in /usr/local/lib/python3.6/dist-packages (from pandas->fastai) (2.6.1)\n",
            "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib->fastai) (1.1.0)\n",
            "Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.6/dist-packages (from matplotlib->fastai) (0.10.0)\n",
            "Requirement already satisfied: jsonschema!=2.5.0,>=2.4 in /usr/local/lib/python3.6/dist-packages (from nbformat>=4.4.0->nbdev) (2.6.0)\n",
            "Requirement already satisfied: ipython-genutils in /usr/local/lib/python3.6/dist-packages (from nbformat>=4.4.0->nbdev) (0.2.0)\n",
            "Requirement already satisfied: traitlets>=4.1 in /usr/local/lib/python3.6/dist-packages (from nbformat>=4.4.0->nbdev) (4.3.3)\n",
            "Requirement already satisfied: jupyter-core in /usr/local/lib/python3.6/dist-packages (from nbformat>=4.4.0->nbdev) (4.6.2)\n",
            "Requirement already satisfied: bleach in /usr/local/lib/python3.6/dist-packages (from nbconvert>=5.6.1->nbdev) (3.1.0)\n",
            "Requirement already satisfied: defusedxml in /usr/local/lib/python3.6/dist-packages (from nbconvert>=5.6.1->nbdev) (0.6.0)\n",
            "Requirement already satisfied: pandocfilters>=1.4.1 in /usr/local/lib/python3.6/dist-packages (from nbconvert>=5.6.1->nbdev) (1.4.2)\n",
            "Requirement already satisfied: entrypoints>=0.2.2 in /usr/local/lib/python3.6/dist-packages (from nbconvert>=5.6.1->nbdev) (0.3)\n",
            "Requirement already satisfied: jinja2>=2.4 in /usr/local/lib/python3.6/dist-packages (from nbconvert>=5.6.1->nbdev) (2.11.1)\n",
            "Requirement already satisfied: pygments in /usr/local/lib/python3.6/dist-packages (from nbconvert>=5.6.1->nbdev) (2.1.3)\n",
            "Requirement already satisfied: testpath in /usr/local/lib/python3.6/dist-packages (from nbconvert>=5.6.1->nbdev) (0.4.4)\n",
            "Requirement already satisfied: mistune<2,>=0.8.1 in /usr/local/lib/python3.6/dist-packages (from nbconvert>=5.6.1->nbdev) (0.8.4)\n",
            "Requirement already satisfied: tqdm<5.0.0,>=4.10.0 in /usr/local/lib/python3.6/dist-packages (from thinc<7.1.0,>=7.0.8->spacy>=2.0.18->fastai) (4.28.1)\n",
            "Requirement already satisfied: setuptools in /usr/local/lib/python3.6/dist-packages (from kiwisolver>=1.0.1->matplotlib->fastai) (45.2.0)\n",
            "Requirement already satisfied: decorator in /usr/local/lib/python3.6/dist-packages (from traitlets>=4.1->nbformat>=4.4.0->nbdev) (4.4.1)\n",
            "Requirement already satisfied: webencodings in /usr/local/lib/python3.6/dist-packages (from bleach->nbconvert>=5.6.1->nbdev) (0.5.1)\n",
            "Requirement already satisfied: MarkupSafe>=0.23 in /usr/local/lib/python3.6/dist-packages (from jinja2>=2.4->nbconvert>=5.6.1->nbdev) (1.1.1)\n",
            "Installing collected packages: fastscript, nbdev, sentencepiece\n",
            "Successfully installed fastscript-0.1.4 nbdev-0.2.13 sentencepiece-0.1.85\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "QQS5KyqY6mif",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#hide\n",
        "# imports\n",
        "import re\n",
        "\n",
        "import sentencepiece as sp\n",
        "\n",
        "from fastai.text import *\n",
        "\n",
        "import warnings\n",
        "\n",
        "#no UserWarning display\n",
        "warnings.simplefilter(\"ignore\", UserWarning)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hUoqWrNO6rsU",
        "colab_type": "text"
      },
      "source": [
        "# Collecting, preparing and exploring the data\n",
        "\n",
        "You will be using the [CodeSearchNet Challenge](https://github.blog/2019-09-26-introducing-the-codesearchnet-challenge/) dataset from GitHub, as it provides a large collection of clean code in multiple languages. Their repo has a really nice [example](https://github.com/github/CodeSearchNet/blob/master/notebooks/ExploreData.ipynb) of how to download and read in the data, which you'll use to get started."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "fY8EOz-_K5rT",
        "colab_type": "code",
        "outputId": "6d0c5d73-1967-423f-da5f-66d2c49201f2",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 663
        }
      },
      "source": [
        "! wget https://s3.amazonaws.com/code-search-net/CodeSearchNet/v2/java.zip\n",
        "! unzip java.zip"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "--2020-03-07 16:37:37--  https://s3.amazonaws.com/code-search-net/CodeSearchNet/v2/java.zip\n",
            "Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.216.179.149\n",
            "Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.216.179.149|:443... connected.\n",
            "HTTP request sent, awaiting response... 200 OK\n",
            "Length: 1060569153 (1011M) [application/zip]\n",
            "Saving to: ‘java.zip’\n",
            "\n",
            "java.zip            100%[===================>]   1011M  88.0MB/s    in 12s     \n",
            "\n",
            "2020-03-07 16:37:49 (84.9 MB/s) - ‘java.zip’ saved [1060569153/1060569153]\n",
            "\n",
            "Archive:  java.zip\n",
            "   creating: java/\n",
            "   creating: java/final/\n",
            "   creating: java/final/jsonl/\n",
            "   creating: java/final/jsonl/train/\n",
            "  inflating: java/final/jsonl/train/java_train_12.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_9.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_3.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_5.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_7.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_1.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_10.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_14.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_0.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_6.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_8.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_15.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_2.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_4.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_13.jsonl.gz  \n",
            "  inflating: java/final/jsonl/train/java_train_11.jsonl.gz  \n",
            "   creating: java/final/jsonl/test/\n",
            "  inflating: java/final/jsonl/test/java_test_0.jsonl.gz  \n",
            "   creating: java/final/jsonl/valid/\n",
            "  inflating: java/final/jsonl/valid/java_valid_0.jsonl.gz  \n",
            "  inflating: java_dedupe_definitions_v2.pkl  \n",
            "  inflating: java_licenses.pkl       \n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Yh5-rUE-7AHK",
        "colab_type": "text"
      },
      "source": [
        "The `jsonl_list_to_dataframe` function comes directly from the CodeSearchNet Challenge example code, and `get_dfs` is just a helper that loads the data into the correct training, validation, and testing splits. Let's see what your data looks like :D!"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "v_OJH1lmLDD8",
        "colab_type": "code",
        "outputId": "8e806e4b-9939-4d67-83e9-8366a56cf9a5",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 119
        }
      },
      "source": [
        "#hide\n",
        "path = Path('.')\n",
        "path.ls()"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "[PosixPath('.config'),\n",
              " PosixPath('java.zip'),\n",
              " PosixPath('java_dedupe_definitions_v2.pkl'),\n",
              " PosixPath('java'),\n",
              " PosixPath('java_licenses.pkl'),\n",
              " PosixPath('sample_data')]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 4
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "2C4PGnSrL1zO",
        "colab_type": "code",
        "outputId": "ab2b2491-8a37-46bb-82b7-b055cb723140",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 204
        }
      },
      "source": [
        "#collapse_show\n",
        "def jsonl_list_to_dataframe(file_list, columns=None):\n",
        "    \"\"\"Load a list of jsonl.gz files into a pandas DataFrame.\"\"\"\n",
        "    return pd.concat([pd.read_json(f,\n",
        "                                   orient='records', \n",
        "                                   compression='gzip',\n",
        "                                   lines=True)[columns] \n",
        "                      for f in file_list], sort=False)\n",
        "\n",
        "def get_dfs(path):\n",
        "    \"\"\"Grabs the different data splits and converts them into dataframes\"\"\"\n",
        "    dfs = []\n",
        "    for split in [\"train\", \"valid\", \"test\"]:\n",
        "        files = sorted((path/split).glob(\"**/*.gz\"))\n",
        "        df = jsonl_list_to_dataframe(files, [\"code\", \"docstring\"])\n",
        "        dfs.append(df)\n",
        "        \n",
        "    return dfs\n",
        "\n",
        "df_trn, df_val, df_tst = get_dfs(path/\"java/final/jsonl\")\n",
        "df_trn.head()"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>code</th>\n",
              "      <th>docstring</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>protected final void bindIndexed(Configuration...</td>\n",
              "      <td>Bind indexed elements to the supplied collecti...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>public void setServletRegistrationBeans(\\n\\t\\t...</td>\n",
              "      <td>Set {@link ServletRegistrationBean}s that the ...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>public void addServletRegistrationBeans(\\n\\t\\t...</td>\n",
              "      <td>Add {@link ServletRegistrationBean}s for the f...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>public void setServletNames(Collection&lt;String&gt;...</td>\n",
              "      <td>Set servlet names that the filter will be regi...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>public void addServletNames(String... servletN...</td>\n",
              "      <td>Add servlet names for the filter.\\n@param serv...</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "                                                code                                          docstring\n",
              "0  protected final void bindIndexed(Configuration...  Bind indexed elements to the supplied collecti...\n",
              "1  public void setServletRegistrationBeans(\\n\\t\\t...  Set {@link ServletRegistrationBean}s that the ...\n",
              "2  public void addServletRegistrationBeans(\\n\\t\\t...  Add {@link ServletRegistrationBean}s for the f...\n",
              "3  public void setServletNames(Collection<String>...  Set servlet names that the filter will be regi...\n",
              "4  public void addServletNames(String... servletN...  Add servlet names for the filter.\\n@param serv..."
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 7
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6Qaf5OG0RmxY",
        "colab_type": "text"
      },
      "source": [
        "You are going to use only a small subset (20%) of the data so that your model trains in a reasonable time. If you want to use more or less data, just adjust the `sample` fraction."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "zEbfaYTZXHEc",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "sample = 0.2\n",
        "df_trn = df_trn.sample(frac = sample)\n",
        "df_val = df_val.sample(frac = sample)\n",
        "df_tst = df_tst.sample(frac = sample)"
      ],
      "execution_count": 0,
      "outputs": []
    },
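    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "If you want to get the same subset every time you rerun the notebook, one option is to pass a seed to `sample`. The snippet below is only a sketch on a toy dataframe; the seed value `42` and the toy data are assumptions for illustration and are not part of the original pipeline.\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Toy stand-in for df_trn (hypothetical data)\n",
        "toy = pd.DataFrame({'code': list('abcdefghij'), 'docstring': list('ABCDEFGHIJ')})\n",
        "\n",
        "# A fixed random_state makes the 20% sample repeatable across runs\n",
        "first = toy.sample(frac=0.2, random_state=42)\n",
        "second = toy.sample(frac=0.2, random_state=42)\n",
        "print(first.equals(second), len(first))\n",
        "```\n",
        "\n",
        "Without `random_state`, each run draws a different subset, so counts and metrics will vary slightly from run to run."
      ]
    },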
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "r3sxnOlI7Hg2",
        "colab_type": "text"
      },
      "source": [
        "Awesome! Now that you have the data, there are a few other preprocessing steps you need to perform. First, you are going to remove any non-English comments (approximated here by dropping comments that contain non-ASCII characters). Next, you will strip the JavaDoc markup, i.e., inline tags in curly braces and everything from the first line containing an @ symbol onward, as that will significantly lessen the amount of learning your model has to do. This also works out well since the JavaDoc syntax can usually be autogenerated from the method's signature."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "VJQltNwLffP7",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#collapse_show\n",
        "# From https://stackoverflow.com/a/27084708/5768407\n",
        "def isASCII(s):\n",
        "    try:\n",
        "        s.encode(encoding='utf-8').decode('ascii')\n",
        "    except UnicodeDecodeError:\n",
        "        return False\n",
        "    else:\n",
        "        return True\n",
        "\n",
        "df_trn = df_trn[df_trn['docstring'].apply(lambda x: isASCII(x))]\n",
        "df_val = df_val[df_val['docstring'].apply(lambda x: isASCII(x))]\n",
        "df_tst = df_tst[df_tst['docstring'].apply(lambda x: isASCII(x))]"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "fe1eU1ZAfg7B",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#collapse_show\n",
        "def filter_jdocs(df):\n",
        "    methods = []\n",
        "    comments = []\n",
        "    for i, row in progress_bar(list(df.iterrows())):\n",
        "        comment = row[\"docstring\"]\n",
        "        # Remove {...} text in comments, adapted from https://stackoverflow.com/questions/14596884/remove-text-between-and-in-python/14598135\n",
        "        comment = re.sub(r\"\\{.*?\\}\", '', comment)\n",
        "        \n",
        "        \n",
        "        cleaned = []\n",
        "        for line in comment.split('\\n'):\n",
        "            if \"@\" in line: break\n",
        "            cleaned.append(line)\n",
        "        comments.append('\\n'.join(cleaned))\n",
        "        methods.append(row[\"code\"])\n",
        "    new_df = pd.DataFrame(zip(methods, comments), columns = [\"code\", \"docstring\"])\n",
        "\n",
        "    return new_df\n",
        "\n",
        "df_trn = filter_jdocs(df_trn);\n",
        "df_val = filter_jdocs(df_val);\n",
        "df_tst = filter_jdocs(df_tst);"
      ],
      "execution_count": 0,
      "outputs": []
    },
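    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To see what this cleaning does to a single comment, here is a minimal standalone sketch. The helper name `clean_javadoc` and the sample comment are made up for illustration; the logic is assumed to mirror the idea behind `filter_jdocs` (drop inline `{...}` tags, then keep only the lines before the first `@` line).\n",
        "\n",
        "```python\n",
        "import re\n",
        "\n",
        "def clean_javadoc(comment):\n",
        "    # Hypothetical helper: drop inline {...} tags first\n",
        "    comment = re.sub(r'\\{.*?\\}', '', comment)\n",
        "    kept = []\n",
        "    for line in comment.split('\\n'):\n",
        "        # Stop at the first line that contains an @ symbol\n",
        "        if '@' in line:\n",
        "            break\n",
        "        kept.append(line)\n",
        "    return '\\n'.join(kept)\n",
        "\n",
        "doc = 'Returns the {@code id} of this user.\\n@return the id'\n",
        "print(clean_javadoc(doc))\n",
        "```\n",
        "\n",
        "Note that the inline tag is removed together with its contents, and a comment whose very first line contains an `@` (an email address, say) ends up empty. That is one reason empty comments need to be removed afterwards."
      ]
    },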
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qPFZX5XAs1tH",
        "colab_type": "text"
      },
      "source": [
        "Now you are going to remove any empty or duplicate comments from your datasets."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "7sPVGcPqg5AA",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "df_trn = df_trn[~(df_trn['docstring'] == '')]\n",
        "df_val = df_val[~(df_val['docstring'] == '')]\n",
        "df_tst = df_tst[~(df_tst['docstring'] == '')]"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "u1la_52mg5ki",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "df_trn = df_trn[~df_trn['docstring'].duplicated()]\n",
        "df_val = df_val[~df_val['docstring'].duplicated()]\n",
        "df_tst = df_tst[~df_tst['docstring'].duplicated()]"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "N9BaitfEs8G0",
        "colab_type": "text"
      },
      "source": [
        "Not bad, that still leaves you with plenty of data to learn from!"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "whEttI5wbnlx",
        "colab_type": "code",
        "outputId": "2960cc3d-8532-4724-80f6-1d4a76fd4b9e",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        }
      },
      "source": [
        "len(df_trn), len(df_val), len(df_tst)"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(73755, 2427, 4615)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 13
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Dds1Ucrw9HQo",
        "colab_type": "text"
      },
      "source": [
        "## Exploring your data!\n",
        "As a good machine learning practitioner, it is extremely important to be careful with your data. This includes checking for biases and duplicates, and also describing the data that you have. Not doing so is setting yourself up for disaster. I have personally experienced such a travesty when working on one of my own research projects, where I forgot to check for duplicates before splitting my data. Sadly for me and all my restless nights working on the project, the data was full of duplicates, so my test set was contaminated with data points from my training set, which led to inflated evaluation metrics :(.\n",
        "\n",
        "**Always explore your data!**"
      ]
    },
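    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "One quick sanity check for this kind of contamination is to count how many test examples also appear in the training split. The sketch below uses two tiny made-up dataframes, and `count_overlap` is a hypothetical helper name, not something from fastai or pandas.\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "def count_overlap(train, test, col):\n",
        "    # Number of test rows whose `col` value also occurs in the training split\n",
        "    return int(test[col].isin(set(train[col])).sum())\n",
        "\n",
        "# Toy splits (hypothetical data)\n",
        "train = pd.DataFrame({'docstring': ['Adds two numbers', 'Closes the stream']})\n",
        "test = pd.DataFrame({'docstring': ['Closes the stream', 'Opens a file']})\n",
        "\n",
        "print(count_overlap(train, test, 'docstring'))\n",
        "```\n",
        "\n",
        "On your real data you could run the same check with `df_trn` and `df_tst` on both the `code` and `docstring` columns."
      ]
    },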
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nJc46Yp4-7es",
        "colab_type": "text"
      },
      "source": [
        "You'll be doing some basic descriptive statistics for this Exploratory Data Analysis (EDA), which just means calculating some means, medians, and standard deviations for different views of your data. The first view you will be exploring is the tokens that make up your code and comments. To tokenize your data into these tokens you will use something called Byte Pair Encoding, which has shown great results for tokenizing both natural language and code as shown in Karampatsis and Sutton's paper [\"Maybe Deep Neural Networks are the Best Choice for Modeling Source Code.\"](https://arxiv.org/abs/1903.05734)\n",
        "\n",
        "A great resource for learning more about how Byte Pair Encoding works is this [blog post](https://towardsdatascience.com/byte-pair-encoding-the-dark-horse-of-modern-nlp-eb36c7df4f10) by Akashdeep Singh Jaswal and this [YouTube video](https://youtu.be/9oTHFx0Gg3Q) by Christopher Manning. Specifically, you will be using the awesome library by Google called [sentencepiece](https://github.com/google/sentencepiece)."
      ]
    },
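    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To get a feel for what BPE does, the sketch below runs a few merge steps on a toy vocabulary in plain Python. The helper names `most_frequent_pair` and `merge_pair` and the three toy identifiers are made up for illustration. This is not how sentencepiece is implemented internally; it only shows the core idea of repeatedly merging the most frequent adjacent pair of symbols.\n",
        "\n",
        "```python\n",
        "from collections import Counter\n",
        "\n",
        "def most_frequent_pair(words):\n",
        "    # Count adjacent symbol pairs, weighted by how often each word occurs\n",
        "    pairs = Counter()\n",
        "    for symbols, freq in words.items():\n",
        "        for a, b in zip(symbols, symbols[1:]):\n",
        "            pairs[(a, b)] += freq\n",
        "    return max(pairs, key=pairs.get)\n",
        "\n",
        "def merge_pair(words, pair):\n",
        "    # Replace every occurrence of `pair` with a single merged symbol\n",
        "    merged = {}\n",
        "    for symbols, freq in words.items():\n",
        "        out, i = [], 0\n",
        "        while i < len(symbols):\n",
        "            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:\n",
        "                out.append(symbols[i] + symbols[i + 1])\n",
        "                i += 2\n",
        "            else:\n",
        "                out.append(symbols[i])\n",
        "                i += 1\n",
        "        key = tuple(out)\n",
        "        merged[key] = merged.get(key, 0) + freq\n",
        "    return merged\n",
        "\n",
        "# Toy corpus: identifier split into characters -> frequency (hypothetical data)\n",
        "vocab = {tuple('getName'): 3, tuple('getId'): 2, tuple('setName'): 1}\n",
        "for _ in range(3):\n",
        "    pair = most_frequent_pair(vocab)\n",
        "    vocab = merge_pair(vocab, pair)\n",
        "    print(pair, list(vocab))\n",
        "```\n",
        "\n",
        "Frequent fragments such as `et` get merged early, so common identifiers end up as a handful of subword tokens while rare ones can still be spelled out from smaller pieces."
      ]
    },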
    {
      "cell_type": "code",
      "metadata": {
        "id": "_skpRX--VTy4",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#collapse_show\n",
        "def df_to_txt_file(df, output, col):\n",
        "    \"\"\"Writes one column of a dataframe to a text file that SentencePiece can use to train a BPE model\"\"\"\n",
        "    with open(output/'text.txt', 'w') as f:\n",
        "        f.write('\\n'.join(list(df[col])))\n",
        "    return output/'text.txt'\n",
        "\n",
        "def gen_sp_model(df, output, tokenizer_name, col):\n",
        "    \"\"\"Trains a SentencePiece BPE model from a pandas dataframe\"\"\"\n",
        "    fname = df_to_txt_file(df, output, col)\n",
        "    sp.SentencePieceTrainer.train(f'--input={fname} --model_prefix={output / tokenizer_name} --model_type=bpe --hard_vocab_limit=false')"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1IGym5z04BjU",
        "colab_type": "text"
      },
      "source": [
        "To use Byte Pair Encoding, you have to train the tokenizer on your data. There is no need to train the BPE model on all of your data, though, so you will just use a subset (10%) of your training data. You are training the BPE model only on your training set to avoid any inadvertent data snooping, i.e., biasing the BPE model toward words that are common in your validation or testing set. This also helps show that you are indeed addressing the out-of-vocabulary problem, because you will most likely encounter words in your testing set that were not in your training set."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Np3-eQnMVSvj",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "p_bpe = 0.1\n",
        "method_tokenizer = \"method_bpe\"\n",
        "gen_sp_model(df_trn.sample(frac = p_bpe), path, method_tokenizer, col = \"code\")\n",
        "comment_tokenizer = \"comment_bpe\"\n",
        "gen_sp_model(df_trn.sample(frac = p_bpe), path, comment_tokenizer, col = \"docstring\")"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "VSCT3OvQAlxL",
        "colab_type": "code",
        "outputId": "8f91f524-a6d5-4290-b78e-bd66e435e46d",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        }
      },
      "source": [
        "#hide\n",
        "method_spm = sp.SentencePieceProcessor()\n",
        "method_spm.Load(str(path / (method_tokenizer + \".model\")))\n",
        "comment_spm = sp.SentencePieceProcessor()\n",
        "comment_spm.Load(str(path / (comment_tokenizer + \".model\")))"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "True"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 16
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Xwxepls7t8wh",
        "colab_type": "text"
      },
      "source": [
        "Now that you have the ability to tokenize your text, let us explore! First you will just generate the frequency of each of your tokens and while you are at it, let's collect how long your methods are by via the common software metric Lines of Code (LOC)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "hK4-mZLC9JLv",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#hide\n",
        "from collections import Counter\n",
        "from statistics import mean, median, stdev"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Q4R8NX3__6I-",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#collapse_show\n",
        "def get_counter_and_lens(df, spm, col):\n",
        "    toks = []\n",
        "    locs = []\n",
        "    for i, row in progress_bar(list(df.iterrows())):\n",
        "        toks.extend(spm.EncodeAsPieces(row[col]))\n",
        "        locs.append(len(row[col].split('\\n')))\n",
        "            \n",
        "    cnt = Counter()\n",
        "    for tok in progress_bar(toks):\n",
        "        cnt[tok] += 1  \n",
        "    return list(map(len, toks)), cnt, locs\n",
        "\n",
        "code_lens, code_cnt, locs = get_counter_and_lens(df_trn, method_spm, 'code')\n",
        "comment_lens, comment_cnt, _ = get_counter_and_lens(df_trn, method_spm, 'docstring')"
      ],
      "execution_count": 0,
      "outputs": []
    },
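    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "The bookkeeping in `get_counter_and_lens` can be sketched on made-up data as follows. This is a simplified illustration rather than the notebook's pipeline: the snippets and the helper `count_tokens_and_lines` are hypothetical, and tokens are split on whitespace instead of using SentencePiece so that the cell runs on its own."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from collections import Counter\n",
        "\n",
        "# Hypothetical snippets standing in for rows of the `code` column.\n",
        "snippets = [\n",
        "    'int add(int a, int b) {\\n    return a + b;\\n}',\n",
        "    'void log() {\\n}',\n",
        "]\n",
        "\n",
        "def count_tokens_and_lines(snippets):\n",
        "    # Whitespace tokens are an assumption here; the notebook uses SentencePiece pieces.\n",
        "    freq = Counter()\n",
        "    locs = []\n",
        "    for snippet in snippets:\n",
        "        freq.update(snippet.split())\n",
        "        locs.append(len(snippet.split('\\n')))\n",
        "    return freq, locs\n",
        "\n",
        "freq, locs = count_tokens_and_lines(snippets)\n",
        "print(freq.most_common(3))\n",
        "print(locs)"
      ],
      "execution_count": 0,
      "outputs": []
    },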
    {
      "cell_type": "code",
      "metadata": {
        "id": "NFVtLdJdAxvE",
        "colab_type": "code",
        "outputId": "50395594-b345-4bdd-a9e3-6512c8004cb8",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 433
        }
      },
      "source": [
        "#collapse_show\n",
        "def plot_counts(counts, top_k = 30):\n",
        "    labels, values = zip(*counts.most_common()[:top_k])\n",
        "\n",
        "    indexes = np.arange(len(labels))\n",
        "    width = 1\n",
        "    plt.figure(num=None, figsize=(22, 4), dpi=60, facecolor='w', edgecolor='k')\n",
        "    plt.bar(indexes, values, width)\n",
        "    plt.xticks(indexes + width * 0.5, labels)\n",
        "    plt.show()\n",
        "\n",
        "plot_counts(code_cnt, top_k = 30)\n",
        "plot_counts(comment_cnt, top_k = 30)"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAABDAAAADQCAYAAADxn5GHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAAJOgAACToB8GSSSgAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAgAElEQVR4nO3dfXBU9b3H8U8SEgOXydONg3iTQKvR\nUKdYBSzssmEJEGMAr6BZsAKm1taits110CjlghBokY4MjuLTADdwBdpdSmk1mkDANQ9LRVLAUkGB\nFptEsdA8IRdMNtn7B7PbAEnkIbvnJLxfM5mBsye/7/dsds/ufvZ3zgnz+Xw+AQAAAAAAmFi40Q0A\nAAAAAAB8HQIMAAAAAABgegQYAAAAAADA9AgwAAAAAACA6RFgAAAAAAAA0+tjdAOXYtiwYbrhhhuM\nbgMAAAAAAATJkSNHVFVVdcHyHhVg3HDDDXI6nUa3AQAAAAAAgsThcHS4nENIAAAAAACA6RFgAAAA\nAAAA0yPAAAAAAAAApkeAAQAAAAAATI8AAwAAAAAAmB4BBgAAAAAAMD0CDAAAAAAAYHp9jG7gajH4\n6SLDah9dOtGw2gAAAAAAdAdmYAAAAAAAANMjwAAAAAAAAKZHgAEAAAAAAEyPAAMAAAAAAJje1wYY\nbrdb48aN09ixY/W73/1OFRUVslgsGj16tP785z9Lko4dO6bMzExZrVa98cYbkqTW1lY99NBDstls\nysvLC4z3wgsvyGq16u6771ZTU5MkdTgmAAAAAACAX5cBxunTp/X888/rnXfe0bvvvqspU6bo5z//\nuYqKirRhwwbl5+dLkp577jk99dRTeu+997Ry5UqdOXNGb731lq6//nqVl5fr1KlT2rlzp06cOKE/\n/OEPqqio0LRp07Ry5UpJ6nBMAAAAAAAAvy4DjJ07d6pv376aPHmypkyZos8//1wRERGKj49XSkqK\n6urqJEm7du1SRkaG+vTpo+HDh2v//v3yeDzKzMyUJGVlZamyslIffPCBxowZo7CwsMCy06dPdzgm\nAAAAAACAX5+ubvziiy90+PBh/fGPf1RpaakWLFigmJiYf/1ynz5qbm5WS0uLwsPPZiGxsbGqq6tT\nfX19YN2LXdZ+zKioqG7fWAAAAAAA0DN1OQMjLi5OVqtVUVFRGjdunPbs2RM4b4Ukeb1eRUVFKTIy\nUm1tbZKkxsZGJSQkKC4uLrDuxS5rP6afy+WSw+GQw+FQdXV19205AAAAAADoMboMMEaMGKEDBw7I\n5/Np7969+ta3viWv16uGhgZVV1crISEhsJ7b7ZbX61VVVZVuueUWWSwWlZaWSpJKSkpktVo1YsQI\nlZWVnbOsX79+HY7pl5OTI6fTKafTqeTk5GDcBwAAAAAAwOS6PIQkMTFRU6ZMCZy3Ys2aNaqtrVV2\ndrbCwsL08ssvS5Ly8/M1a9YszZs3Tz/+8Y/Vt29fTZo0SVu2bJHNZtNtt92mUaNGSZImTpwoq9Wq\n+Ph4rV+/XpK0ePHiC8YEAAAAAADwC/P5fD6jm7hYDodDTqfT6DYuy+CniwyrfXTpRMNqAwAAAABw\nKTr77N/lISQAAAAAAABmQIABAAAAAABMjwADAAAAAACYHgEGAAAAAAAwPQIMAAAAAABgegQYAAAA\nAADA9AgwAAAAAACA6RFgAAAAAAAA0yPAAAAAAAAApkeAAQAAAAAATI8AAwAAAAAAmB4BBgAAAAAA\nMD0CDAAAAAAAYHoEGAAAAAAAwPQIMAAAAAAAgOkRYAAAAAAAANMjwAAAAAAAAKZHgAEAAAAAAEyP\nAAMAAAAAAJhelwHG0aNHde2118put8tut+v48eNyuVyyWCwaN26campqJEkHDx5Uenq6LBaLtm/f\nLkk6deqUpk6dqtGjR2vZsmWB
MfPz82Wz2TRz5ky1tLRIUodjAgAAAAAA+H3tDIwxY8bI7XbL7XYr\nPj5ey5cvl9vt1qJFi1RQUCBJmjt3rlavXq3i4mLNnz9fkrRq1SplZ2eroqJCO3bsUG1trfbt26fa\n2lqVl5crLS1NmzZtktfr7XBMAAAAAAAAv68NMCorK2Wz2TR37lwdOnRIQ4YMUVRUlKxWqz788ENJ\n0meffabU1FTFxMQoISFBJ06ckMfjUWZmpiRpwoQJ2rlz5znLsrKyVFlZ2emYAAAAAAAAfl0GGAMH\nDtThw4dVVlamf/zjH9q8ebNiYmICt7e2tkqS2traAstiY2NVV1en+vr6wLoXu6z9mAAAAAAAAH59\nurrxmmuu0TXXXCNJmjp1qgoLC9W/f//A7REREZKk8PB/5SCNjY1KSEhQXFycmpqaFBcXp8bGRg0a\nNEher1dNTU0drnf+mH4ul0sul0uSVF1dfSXbCgAAAAAAeqguZ2CcPHky8O/y8nJNnDhRBw4cUHNz\nszwej4YOHSrp7EyNI0eO6OTJk6qrq1NiYqIsFotKS0slSaWlpRo5cuQ5y0pKSmS1WpWamtrhmH45\nOTlyOp1yOp1KTk7u1o0HAAAAAAA9Q5czMCoqKjRv3jz169dP3/jGN1RQUKDo6GjZ7XZFR0dr7dq1\nkqQlS5YoNzdXra2tWrhwoSTp4Ycf1owZM7RmzRpNmjRJSUlJSkpK0oABA2Sz2ZSSkqI5c+YoMjJS\neXl5F4wJAAAAAADgF+bz+XxGN3GxHA6HnE6n0W1clsFPFxlW++jSiYbVBgAAAADgUnT22f9rr0IC\nAAAAAABgNAIMAAAAAABgegQYAAAAAADA9AgwAAAAAACA6RFgAAAAAAAA0yPAAAAAAAAApkeAAQAA\nAAAATI8AAwAAAAAAmB4BBgAAAAAAMD0CDAAAAAAAYHoEGAAAAAAAwPQIMAAAAAAAgOkRYAAAAAAA\nANMjwAAAAAAAAKZHgAEAAAAAAEyPAAMAAAAAAJgeAQYAAAAAADA9AgwAAAAAAGB6fYxuAME3+Oki\nw2ofXTrRsNoAAAAAgN6DGRgAAAAAAMD0LirA2Lhxo6699lpJksvlksVi0bhx41RTUyNJOnjwoNLT\n02WxWLR9+3ZJ0qlTpzR16lSNHj1ay5YtC4yVn58vm82mmTNnqqWlpdMxAQAAAAAA/L42wGhtbZXL\n5VJycrK8Xq+WL18ut9utRYsWqaCgQJI0d+5crV69WsXFxZo/f74kadWqVcrOzlZFRYV27Nih2tpa\n7du3T7W1tSovL1daWpo2bdrU6ZgAAAAAAAB+XxtgbNy4UTk5OQoPD9ehQ4c0ZMgQRUVFyWq16sMP\nP5QkffbZZ0pNTVVMTIwSEhJ04sQJeTweZWZmSpImTJignTt3nrMsKytLlZWVnY4JAAAAAADg12WA\n0draKqfTqWnTpkmS6uvrFRMTc87tktTW1hZYFhsbq7q6unPWvdhl7cf0c7lccjgccjgcqq6uvpJt\nBQAAAAAAPVSXVyF544035HA4FB5+NueIi4tTU1NT4PaIiAhJCtwuSY2NjUpISAisGxcXp8bGRg0a\nNEherzfw++evd/6Yfjk5OcrJyZEkORyOK9lWAAAAAADQQ3U5A+Ojjz7SunXrlJWVpUOHDunFF1/U\ngQMH1NzcLI/Ho6FDh0qSBg4cqCNHjujkyZOqq6tTYmKiLBaLSktLJUmlpaUaOXLkOctKSkpktVqV\nmpra4ZgAAAAAAAB+Xc7AeO655wL/Hj58uF555RX95je/kd1uV3R0tNauXStJWrJkiXJzc9Xa2qqF\nCxdKkh5++GHNmDFDa9as0aRJk5SUlKSkpCQNGDBANptNKSkpmjNnjiIjI5WXl3fBmAAAAAAAAH5h\nPp/PZ3QTF8vhcMjpdBrdxmUZ/HSR0S0Y4ujSiUa3AAAAAADoQTr77P+1VyEBAAAAAAAwGgEGAA
AA\nAAAwPQIMAAAAAABgegQYAAAAAADA9AgwAAAAAACA6RFgAAAAAAAA0yPAAAAAAAAApkeAAQAAAAAA\nTK+P0Q2gdxv8dJFhtY8unWhYbQAAAABA92IGBgAAAAAAMD0CDAAAAAAAYHoEGAAAAAAAwPQIMAAA\nAAAAgOkRYAAAAAAAANMjwAAAAAAAAKZHgAEAAAAAAEyPAAMAAAAAAJgeAQYAAAAAADA9AgwAAAAA\nAGB6XQYYX3zxhSwWi8aMGaOMjAx9/vnnqqiokMVi0ejRo/XnP/9ZknTs2DFlZmbKarXqjTfekCS1\ntrbqoYceks1mU15eXmDMF154QVarVXfffbeampokqcMxAQAAAAAA/LoMMBITE1VRUaH33ntPs2bN\n0urVq/Xzn/9cRUVF2rBhg/Lz8yVJzz33nJ566im99957Wrlypc6cOaO33npL119/vcrLy3Xq1Cnt\n3LlTJ06c0B/+8AdVVFRo2rRpWrlypSR1OCYAAAAAAIBflwFGRESEwsPPrnLy5EndcMMNioiIUHx8\nvFJSUlRXVydJ2rVrlzIyMtSnTx8NHz5c+/fvl8fjUWZmpiQpKytLlZWV+uCDDzRmzBiFhYUFlp0+\nfbrDMQEAAAAAAPz6fN0Ke/fu1SOPPKKGhgZt3bpVv/nNb/71y336qLm5WS0tLYGgIzY2VnV1daqv\nr1dMTMwlLWs/ZlRUlCTJ5XLJ5XJJkqqrq7tpswEAAAAAQE/ytQHGd77zHb3//vtyOp1asmRJ4LwV\nkuT1ehUVFaXIyEi1tbUpPDxcjY2NSkhIUFxcXGDd9ssOHz58wbKOxvTLyclRTk6OJMnhcHTPVgMA\nAAAAgB6ly0NImpubA/+OjY1V//795fV61dDQoOrqaiUkJEiSRowYIbfbLa/Xq6qqKt1yyy2yWCwq\nLS2VJJWUlMhqtWrEiBEqKys7Z1m/fv06HBMAAAAAAMCvyxkYe/fu1Zw5cxQREaHo6GitWbNGhw4d\nUnZ2tsLCwvTyyy9LkvLz8zVr1izNmzdPP/7xj9W3b19NmjRJW7Zskc1m02233aZRo0ZJkiZOnCir\n1ar4+HitX79ekrR48eILxgQAAAAAAPAL8/l8PqObuFgOh0NOp9PoNi7L4KeLjG7hqnN06USjWwAA\nAAAAXKLOPvt3eQgJAAAAAACAGRBgAAAAAAAA0yPAAAAAAAAApkeAAQAAAAAATK/Lq5AAPZmRJ07l\nBKIAAAAA0L0IMIAguFqvOkNwAwAAACBYOIQEAAAAAACYHgEGAAAAAAAwPQIMAAAAAABgegQYAAAA\nAADA9AgwAAAAAACA6RFgAAAAAAAA0yPAAAAAAAAApkeAAQAAAAAATI8AAwAAAAAAmF4foxsA0HsM\nfrrI6BYMcXTpRKNbAAAAAHo9ZmAAAAAAAADTI8AAAAAAAACmR4ABAAAAAABMr8sAY9euXRo1apTS\n09N1//33q6WlRS6XSxaLRePGjVNNTY0k6eDBg0pPT5fFYtH27dslSadOndLUqVM1evRoLVu2LDBm\nfn6+bDabZs6cqZaWFknqcEwAAAAAAAC/LgOM5ORk7dixQ2VlZRo8eLB+//vfa/ny5XK73Vq0aJEK\nCgokSXPnztXq1atVXFys+fPnS5JWrVql7OxsVVRUaMeOHaqtrdW+fftUW1ur8vJypaWladOmTfJ6\nvR2OCQAAAAAA4NdlgDFw4ED17dtXkhQVFaWPP/5YQ4YMUVRUlKxWqz788ENJ0meffabU1FTFxMQo\nISFBJ06ckMfjUWZmpiRpwoQJ2rlz5znLsrKyVFlZqUOHDnU4JgAAAAAAgN9FXUb1008/1datW7V0\n6VIdP348sLy1tVWS1NbWFlgWGxururo61dfXKyYm5oJlAwcO7HS99mMCQE9h5OVjuYQrAAAArhZf\nG2A0NTVp5syZKiwsVGtrq5qamgK3RURESJLCw/81kaOxsV
EJCQmKi4tTU1OT4uLi1NjYqEGDBsnr\n9QZ+//z1zh/Tz+VyyeVySZKqq6uvYFMBAAAAAEBP1eUhJF6vV9OnT9eCBQt08803KzU1VQcOHFBz\nc7M8Ho+GDh0q6eyhJkeOHNHJkydVV1enxMREWSwWlZaWSpJKS0s1cuTIc5aVlJTIarV2OqZfTk6O\nnE6nnE6nkpOTg3EfAAAAAAAAk+tyBsbGjRv1/vvvq6CgQAUFBZo9e7by8vJkt9sVHR2ttWvXSpKW\nLFmi3Nxctba2auHChZKkhx9+WDNmzNCaNWs0adIkJSUlKSkpSQMGDJDNZlNKSormzJmjyMjIDscE\nAAAAAADwC/P5fD6jm7hYDodDTqfT6DYui5HHyAPovTgHBgAAAHqbzj77X9RJPAEA5sQJRAEAAHC1\n6PIcGAAAAAAAAGZAgAEAAAAAAEyPAAMAAAAAAJge58AAAFwWzr8BAACAUCLAAAD0OIQnAAAAVx8O\nIQEAAAAAAKZHgAEAAAAAAEyPAAMAAAAAAJgeAQYAAAAAADA9AgwAAAAAAGB6BBgAAAAAAMD0uIwq\nAACXgEu4AgAAGIMZGAAAAAAAwPSYgQEAQA/B7A8AAHA1YwYGAAAAAAAwPQIMAAAAAABgegQYAAAA\nAADA9AgwAAAAAACA6XUZYDQ2NuqOO+5Q//79tX//fkmSy+WSxWLRuHHjVFNTI0k6ePCg0tPTZbFY\ntH37dknSqVOnNHXqVI0ePVrLli0LjJmfny+bzaaZM2eqpaWl0zEBAAAAAAD8ugww+vXrp6KiIt13\n332SJK/Xq+XLl8vtdmvRokUqKCiQJM2dO1erV69WcXGx5s+fL0latWqVsrOzVVFRoR07dqi2tlb7\n9u1TbW2tysvLlZaWpk2bNnU6JgAAAAAAgF+XAUZkZKSuvfbawP8PHTqkIUOGKCoqSlarVR9++KEk\n6bPPPlNqaqpiYmKUkJCgEydOyOPxKDMzU5I0YcIE7dy585xlWVlZqqys7HRMAAAAAAAAv0s6B0Z9\nfb1iYmIC/29tbZUktbW1BZbFxsaqrq7unHUvdln7MQEAAAAAAPz6XMrKcXFxampqCvw/IiJCkhQe\n/q8cpLGxUQkJCYF14+Li1NjYqEGDBsnr9QZ+//z1zh/Tz+VyyeVySZKqq6svcfMAAAAAAEBvcEkz\nMFJTU3XgwAE1NzfL4/Fo6NChkqSBAwfqyJEjOnnypOrq6pSYmCiLxaLS0lJJUmlpqUaOHHnOspKS\nElmt1k7H9MvJyZHT6ZTT6VRycnJ3bDMAAAAAAOhhvnYGRnZ2tvbu3auPP/5YjzzyiPLy8mS32xUd\nHa21a9dKkpYsWaLc3Fy1trZq4cKFkqSHH35YM2bM0Jo1azRp0iQlJSUpKSlJAwYMkM1mU0pKiubM\nmaPIyMgOxwQAAAAAAPAL8/l8PqObuFgOh0NOp9PoNi7L4KeLjG4BAIAe6ejSiUa3AAAAQqizz/6X\ndAgJAAAAAACAES7pJJ4AAAChxizG0GPWCwDAjJiBAQAAAAAATI8ZGAAAADiHkbNemP0BAOgMAQYA\nAABMg/AEANAZDiEBAAAAAACmxwwMAAAAQMz+AACzI8AAAAAADMbVdkKP0AjoeQgwAAAAAFx1mHED\n9DwEGAAAAAAQQoQnwOUhwAAAAACAq8TVergSwU3vQIABAAAAAOjVCG56By6jCgAAAAAATI8AAwAA\nAAAAmB4BBgAAAAAAMD0CDAAAAAAAYHoEGAAAAAAAwPQIMAAAAAAAgOkRYAAAAAAAANMjwAAAAAAA\nAKZnmgAjPz9fNptNM2fOVEtLi9HtAAAAAAAAEzFFgLFv3z7V1taqvLxcaWlp2rRpk9EtAQAAAAAA\nEzFFgOHxeJSZmSlJysrKUmVlpcEdAQAAAAAAM+ljdAOSVF9fr4EDB0qSYmNjVVdXF7jN5XLJ5XJJ\nknbv3i2Hw2FIj1fqDg
NrV1dXKzk5mdrUpja1qU1talOb2tSmNrWpfRXVHjVqsWG1r8SRI0c6vsFn\nAitXrvStXbvW5/P5fLt37/Y99thjBnfUu+Tk5FCb2tSmNrWpTW1qU5va1KY2tando5niEBKLxaLS\n0lJJUklJiaxWq8EdAQAAAAAAM4l49tlnnzW6ieuuu04ej0cFBQVqbm7WM888o4iICKPb6lVuueUW\nalOb2tSmNrWpTW1qU5va1KY2tXusMJ/P5zO6CQAAAAAAgK6Y4hASAAAAAACArhBgIKjefvttrV69\n2ug2DOF0OnX77bdflZcF9vl8ysrK0vTp041uJeiqq6v1X//1X0a3IUmaPn26vF5v0OuYaZsRWnl5\neTp9+rT+7//+T3a7XePHjw96zddffz3oNS7Fli1b9I9//COkNefNm6d9+/ZJkh588MFzrtbWnfyv\n2dOmTRMTdC80fPhwSVJubq72799vcDfBYbfb5Xa7ZYIjzEPK/7dtz/933rt3r1555ZVuq+V/ntnt\ndj377LNyu93dNrbRfv3rX2vkyJFKT0/XlClTJElut1uffPJJh+v7X1OCzQz7tu5+HF3NCDAQVK+9\n9poeeOABo9swhMPh0OLFi/XOO+8Y3UrIHTt2TOHh4fr1r39tdCtBl5ycrGPHjqmhocHQPsrKyvTt\nb39bffoE/+rYZtlmhN6KFSvUt29f7du3T7feemvgBNxXqq2trdPbLiXA6Gqc7mJEgOG/v9va2tTQ\n0KCEhISg1PG/ZlssFm3dujUoNYDOFBYWmvLD/He+8x3Nnj2728brze+Nly5dqrKyMpWVlWnNmjWS\nOg8w2traAq8pwWaGfVt3P46uZgQYCJqGhga1trYqOjra6FYM069fP505c8boNkLuq6++0r/9278Z\n3UbI2Gw2lZSUGNrDli1bNGHChJDV829zqL49MYM//vGP+u53v6uxY8ca9u3kI488YkhdP7vdri+/\n/FI/+9nPtHnzZj366KNXPN5TTz2lO++8Uz6fTz/5yU80duxYjR8/XjU1NXrllVf08ccfy263a8eO\nHYH6knTffffp6NGjKiws1PTp0zV58mQVFxfr9ttv1+OPP67vfve7eu655y66F6/Xq/vuu0/jx4/X\nY489ptzcXBUXF8tms8lisWjjxo3629/+puLiYn3/+9/XU089dUXbfrG++OILDRgwQJK0e/duDRs2\nLCh12r9mT5gwQVu2bJF09kPlzp07g1JTOvsYeOKJJ5Senq7HH39cknTmzBnNmDFDGRkZuvvuu9XU\n1KTXX39dGzZs0OnTp3XNNdfo73//u9577z0tWLCg2+sXFhbqpZdekiS99dZbIX++Nzc3q6WlJaQ1\n/datW6c77rgjcF/0ZHa7XT/5yU+Unp6un/3sZ5I6/9t++eWXuv/++zV8+HBt2LDhnHHcbrfmzJkj\n6ew3+SNHjpTdbtf//u//XnJP7Z9n69at0+OPP6477rjjCrby4rW0tKi5uTmoNU6fPi2Px6PW1lbF\nx8fr9OnTKiws1DPPPKNZs2bJ7XZr8uTJmjJligoLCwP79MLCQt17772aPHmyRowYoc8//1yS9Itf\n/EKjRo3ST3/6U91+++2X1VNn+7ZQa/84wpUhwEDQfPLJJxo8eLDRbRgqNjZWNTU1RrcRctXV1YqN\njQ1pzV/96ley2+3n/CxdujQktb/5zW/qo48+Ckmtzhw8eFDf/OY3Q1bPv82h+vbEDIqKirRgwQK9\n++67mj9/viE9vPbaa4bUPd+yZcs0bdo0vfzyy1c81p133qlt27apqKhI8fHxevfdd7VkyRItXbpU\ns2fP1s033yy3262MjIxOx4iMjNSbb76p7OxsNTQ06Mknn5TH47mkDxhbtmzRTTfdpNLSUt16663y\n+XwqKCjQ9u3bVV5erpdeekkpKSnKysrS//zP/2jZsmVXvO0Xo6SkRHfeeackqbi4WHfd
dVdQ6rR/\nzW6/T8vNzdWoUaOCUtPvnnvuUVlZmaqqqtTY2KhVq1YpIyNDO3bs0AMPPKDXX39dNptN5eXlev/9\n95WRkaHy8nKVl5crPT292+sb5S9/+YueeOIJZWRkGNZHSkqK+vXrp8TEREPqd7fJkyerrKxMX3zx\nhf70pz91ul5NTY1WrlypyspKLVu2TK2trRes09bWpmeeeUZbt26V2+2+rFkU7Z9nKSkpSkxMVL9+\n/S55nMvR2NiojIwMPfHEE/rLX/4SlBrr16/Xiy++qBtvvFELFy5U3759lZubq1/+8pdat25doI/N\nmzfroYceOud3Y2Nj9eabb+qhhx6Sy+XSsWPHVFJSIo/Ho8cff1z19fWX1VNn+7berqv3xUa+Z+4O\nwZ9rDFzFbr31Vn388ceaPn36VXE4hXT2Q96DDz6oTZs2hbTuk08+qSeffDKkNXF1eeyxx7R48WKt\nX79eDzzwgLKzs41uqVcYMWKEJOmjjz7S7373O5WVlcnn8yk5OfmCdcPCwgL/bn8cs38MSYqPj9eg\nQYMk6ZJmAB4+fDgwu2HYsGHasmWLPvnkE2VmZko6+y3e8ePHL2HLuse2bdv04osvSjo7A2PevHkh\n7yHYbrvtNknSf/zHf6ihoUEfffSRPvjgA61bt04tLS2y2WxKS0vTgQMHVFZWprlz52rDhg2qrq7W\nE0880e31O3ucBUNLS4sKCwu1adMmDRo0SN///ve1fPnyoNY0m0WLFmnHjh06duyYoqOjFRcXp+99\n73v60Y9+dMVj+5/TI0aM0KFDhzr9237jG98IHJqVnJysEydOXDDW8ePHlZycrJiYGElSeHjP+h44\nMTFRFRUV8ng8euGFF/Tpp5/qvvvuU25uriIjI7ulxvDhw/Xb3/5Wzc3NysrK0sGDBztcp/3fwc//\nPExOTlZVVZWOHj2qoUOHKiwsTDfddJP69+/fLT1eLbp6X9zT3zMTYCBobrrpJh09etToNgy1d+9e\npaamhjS8qKmpUVJSUsjqnW/ixInavHmz1q1bJ7vdHrK6v/rVr1RUVHTOsqysLD399NNBr/3Xv/5V\nQ4YMCXqdrtx8883661//GrJvzfzbXFtbq+uvv77DNyO9TWxsrF566SU1Nzdr2LBhhgQYRj+/g8H/\nISAtLU0Oh0P//d//LUmBKfTtH1vx8fGqqanRjTfeeM43iO0/SFzuY/HGG2/Unj17dO+992rPnj1K\nTExUWlqatm7dqqioKLW0tCgyMlKRkZEdfjsbDG1tbWpqalJcXJzq6uoUHx8ftA9N7V+z2+/T6urq\nFB0dHdRvic//UJmWlqZRo0Zp5syZks4+FsLCwpSQkKDKykrNnz9fy5cv11dffdUtfZ1fPz4+XgcO\nHJCkwMlTg+XkyZN69dVXdS2onNYAAASPSURBVPvtt2v27NmBD3FXk/nz52v+/PkqLCzU4MGDu/W9\nw549ezR+/Hjt3r1bdrtdtbW1Hf5tjx49qvr6evXr10/V1dUdvpZee+21qqmp0Zdffqn+/furra3t\nkp+PZnhvbLFY1LdvX7388st69dVXde+993bbeXUOHTqk1NRURUVFKTY2Vj6f74J9Zmf32fnPw8GD\nB2v//v3y+Xw6fPhw4PDBS9XZvq236+p9sZHvmbtDz4oOcVmOHTt2xceIXo64uDiFh4dfleeA8Gtq\nalJKSkrI6nm9Xt1///0hq9eZlJSUkJ/g8cknn5Tb7T7nJ1Q74vLy8sAUb6Pcc8892rZtW8jq+bd5\nxowZQT+m1ixee+01paeny263Kzc3N+T1zfL8DpbJkyfrn//8p8aOHauMjIzAdOObb75Z9957ryor\nK/Xoo48qJydHDz74YOC8EN3lnnvu0cGDBzVu3Di9//77uuaaazRv3jxNmDBBY8eODUwXv+uuu5SX\nl6clS5Z0a/2O7Nq1KzC7ZOvWrUE9z0371+xt27bp
P//zPyVJy5cvD+o5MDryox/9SNu2bVNGRoYy\nMjICJ90bPXp04PxK1113XdA+7I8fP14ej0fZ2dn69NNPg1LDLyEhQVVVVXr00Ue1evVqjR07VitW\nrNBXX30V1LpXi3feeUfp6elKTEzUsGHDOv3bJicn66c//amsVqvmzJmjiIiIC8YKDw/XkiVLNG7c\nOI0dO1br16+/5H6MfG/81VdfacWKFRo7dqxWr16tRx99VFVVVd16UuA5c+bIYrFo9OjRSktL05Ah\nQ5SRkaHnn38+cB6Si3XddddpwoQJGjVqlFasWHHZfXa2b+vtunpfbOR75u4Q5uM6WQiioqIiHTt2\nTD/4wQ9CUq/9JSRDcTWG82u2Fx4erjfffFNVVVVatGhRSHrZtWuX9u3bpx/+8IdBr9XVdh8/flwP\nPPBAt12h4GLrGjGds7q6Ws8//7xWrFgRknpdbf/999+v9evXB/2x79/m559/Xo899pheffXVoNbD\nWWZ5fl/O88xsz9vO+GdZvP7666qvr1d+fn7Iand0H1VVVSkxMVE33HCDSkpKNHz4cP37v/970Hrw\nv2Zv3bpVGzduVHh4uGbPnq0XX3zxivcrRj8GjK5/sc6cOaPNmzcrKysraFebkYy5P4JVs7NxMzIy\n9Pbbb5vu0INQvzf2q6urU3FxsaZOnXrFJ9gP1ePHv0/+5JNPlJeXp7fffvuyxulo39Zb9JR9W7fy\nAb2IpMBPKFRXV59Ts/3PkCFDfOnp6b49e/aEpJdQ6mq7FyxY4PP5fL5Zs2b5HA5HyOv2Zlf79vt8\nPl9DQ4NvzJgx5/zs3r3b6LZ6le5+nPWkx+1dd93ls9lsvvHjx/v++c9/hqxuT7qPLofR22d0fbMx\n4v4IVs2uxh0zZozv5MmT3bcR8Pl8oX38PPPMM7709HTfiBEjfB988EG3jt0bXK37NmZgAAAAAAAA\n0+ul80oAAAAAAEBvQoABAAAAAABMjwADAAAAAACYHgEGAAAAAAAwPQIMAAAAAABgev8PFLEHrpOd\noccAAAAASUVORK5CYII=\n",
            "text/plain": [
              "<Figure size 1320x240 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAABDAAAADQCAYAAADxn5GHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAAJOgAACToB8GSSSgAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAgAElEQVR4nO3de1zUdb7H8Tcgph0fgKytWYi2xabH\nsmOGAsMA3ogQ3dUEq9Ui85RmblYeKbc1lWNlF9pOma3rtdVtY3xUtvlIWkKSi23pEdS8HHPVRcxO\nHAjIMpjhe/7owSyYN3AuP+z1fDx6JN+Z+X0+37n8ZuY9399MgDHGCAAAAAAAwMIC/d0AAAAAAADA\nuRBgAAAAAAAAyyPAAAAAAAAAlkeAAQAAAAAALI8AAwAAAAAAWF4nfzfQFoMHD9bVV1/t7zYAAAAA\nAICXHDx4UNu3b//BeIcKMK6++mrl5ub6uw0AAAAAAOAlGRkZpx3nEBIAAAAAAGB5BBgAAAAAAMDy\nCDAAAAAAAIDlnTXAqK2t1ZAhQ9StWzft3r3bPX7kyBFdcskl7rF9+/YpISFBcXFx+uCDDyRJJ06c\n0Pjx4xUfH69nnnnGfdmsrCzZ7XZNnjxZjY2NkiSHw6G4uDiNGDFCR48e9fgkAQAAAABAx3bWAOPS\nSy/Vxo0bNWHChFbjzzzzjGw2m/vvuXPnasWKFdq0aZPmzZsnSVq+fLlSU1NVXFysgoICVVZWqry8\nXJWVlSoqKlK/fv20fv16OZ1O5eTkqLCwUAsXLlR2drYXpgkAAAAAADqyswYYwcHBuuyyy1qNHTp0\nSAEBAYqMjHSPHTt2TFFRUQoJCVF4eLiqqqpUWlqq5ORkSdKoUaO0devWVmMpKSkqKSnRgQMH1L9/\nf3Xu3Fk2m007d+709BwBAAAAAEAH1+bvwFi8eLFmz57daqypqcn979DQUFVXV6umpkYhISFtGpMk\nl8vVatsOh0MZGRnKyMhQRUVFW9sFAAAAAAAXgU5tOfPBgwclSX379m01Hhj4zxyktrZW4eHhCgsL\nU11dncLCwlRbW6s+ffrI6XSqrq7utOdrFhQU1Grb6enpSk9Pl3Tm34LtCPo+utFvtQ8/PdpvtQEA\nAAAA8IQ2rcAoLy/Xp59+qpSUFP31r3/VtGnTdPLkSfXq1UsHDx5UfX29qqur1aNHD8XFxSk/P1+S\nlJ+fr5iYmFZjeXl5stlsioqK0t69e9XQ0KDS0lINHDjQ87MEAAAAAAAd2jlXYKSmpqqsrEz79+/X\nfffdp6KiIklSZmamZs+erS5dumjRokXKzMyUy+XSggULJElTp07VpEmTtHLlSqWlpSkiIkIRERHq\n2bOn7Ha7IiMjNXv2bAUHB2vWrFlKSkpSly5dtGbNGu/OGAAAAAAAdDgBxhjj7ybOV0ZGhnJzc/3d\nRrtwCAkAAAAAAOd2pvf+bf4STwAAAAAAAF8jwAAAAAAAAJZHgAEAAAAAACyPAAMAAAAAAFgeAQYA\nAAAAALA8AgwAAAAAAGB5BBgAAAAAAMDyCDAAAAAAAIDlEWAAAAAAAADLI8AAAAAAAACWR4ABAAAA\nAAAsjwADAAAAAABYHgEGAAAAAACwPAIMAAAAAABgeQQYAAAAAADA8ggwAAAAAACA5Z01wKitrdWQ\nIUPUrVs37d69W/X19Ro+fLgSEhI0fPhwHTlyRJK0b98+JSQkKC4uTh988IEk6cSJExo/frzi4+P1\nzDPPuLeZlZUlu92uyZMnq7GxUZLkcDgUFxenESNG6OjRo96aKwAAAAAA6KDOGmBceuml2rhxoyZM\nmCBJCg4O1tq1a7VlyxZlZWXp2WeflSTNnTtXK1as0KZNmzRv3jxJ0vLly5Wamqri4mIVFBSosrJS\n5eXlqqysVFFRkfr166f169fL
6XQqJydHhYWFWrhwobKzs708ZQAAAAAA0NGcNcAIDg7WZZdd5v67\nS5cuuuKKKyRJnTt3VmDg9xc/duyYoqKiFBISovDwcFVVVam0tFTJycmSpFGjRmnr1q2txlJSUlRS\nUqIDBw6of//+6ty5s2w2m3bu3OmViQIAAAAAgI6rXd+B0dDQoPnz52vmzJmSpKamJvdpoaGhqq6u\nVk1NjUJCQto0Jkkul6vdkwEAAAAAABenTu250L333qv7779fUVFRkuReiSF9/70Z4eHhCgsLU11d\nncLCwlRbW6s+ffrI6XSqrq7utOdrFhQU1KqWw+GQw+GQJFVUVLSnXQAAAAAA0MG1eQXGggUL9LOf\n/UwTJ050j/Xq1UsHDx5UfX29qqur1aNHD8XFxSk/P1+SlJ+fr5iYmFZjeXl5stlsioqK0t69e9XQ\n0KDS0lINHDiwVb309HTl5uYqNzdXvXv3vpC5AgAAAACADuqcKzBSU1NVVlam/fv3KzU1VdnZ2YqP\nj1dBQYFiY2P11FNPadGiRcrMzJTL5dKCBQskSVOnTtWkSZO0cuVKpaWlKSIiQhEREerZs6fsdrsi\nIyM1e/ZsBQcHa9asWUpKSlKXLl20Zs0ar08aAAAAAAB0LAHGGOPvJs5XRkaGcnNz/d1Gu/R9dKPf\nah9+erTfagMAAAAA0BZneu/fri/xBAAAAAAA8CUCDAAAAAAAYHkEGAAAAAAAwPIIMAAAAAAAgOWd\n81dI0PHxBaIAAAAAgI6OAANeRXgCAAAAAPAEDiEBAAAAAACWR4ABAAAAAAAsjwADAAAAAABYHgEG\nAAAAAACwPAIMAAAAAABgeQQYAAAAAADA8ggwAAAAAACA5RFgAAAAAAAAyyPAAAAAAAAAlkeAAQAA\nAAAALI8AAwAAAAAAWN5ZA4za2loNGTJE3bp10+7duyVJDodDcXFxGjFihI4ePSpJ2rdvnxISEhQX\nF6cPPvhAknTixAmNHz9e8fHxeuaZZ9zbzMrKkt1u1+TJk9XY2HjGbQIAAAAAADQ7a4Bx6aWXauPG\njZowYYIkyel0KicnR4WFhVq4cKGys7MlSXPnztWKFSu0adMmzZs3T5K0fPlypaamqri4WAUFBaqs\nrFR5ebkqKytVVFSkfv36af369WfcJgAAAAAAQLOzBhjBwcG67LLL3H8fOHBA/fv3V+fOnWWz2bRz\n505J0rFjxxQVFaWQkBCFh4erqqpKpaWlSk5OliSNGjVKW7dubTWWkpKikpKSM24TAAAAAACgWZu+\nA6OmpkYhISHuv10ulySpqanJPRYaGqrq6upW5z3fsZbbBAAAAAAAaNapLWcOCwtTXV2d+++goCBJ\nUmDgP3OQ2tpahYeHu88bFham2tpa9enTR06n0335U8936jabORwOORwOSVJFRUUbpwcAAAAAAC4G\nbVqBERUVpb1796qhoUGlpaUaOHCgJKlXr146ePCg6uvrVV1drR49eiguLk75+fmSpPz8fMXExLQa\ny8vLk81mO+M2m6Wnpys3N1e5ubnq3bu3J+YMAAAAAAA6mHOuwEhNTVVZWZn279+v++67T7NmzVJS\nUpK6dOmiNWvWSJIWLVqkzMxMuVwuLViwQJI0depUTZo0SStXrlRaWpoiIiIUERGhnj17ym63KzIy\nUrNnz1ZwcPBptwkAAAAAANAswBhj/N3E+crIyFBubq6/22iXvo9u9HcLPzqHnx7t7xYAAAAAAG10\npvf+bTqEBAAAAAAAwB8IMAAAAAAAgOURYAAAAAAAAMsjwAAAAAAAAJZHgAEAAAAAACyPAAMAAAAA\nAFgeAQYAAAAAALA8AgwAAAAAAGB5BBgAAAAAAMDyCDAAAAAAAIDlEWAAAAAAAADL6+TvBgBv6fvo\nRr/VPvz0aL/VBgAAAICLESswAAAAAACA5RFgAAAAAAAAyyPAAAAAAAAAlkeAAQAAAAAALI8AAw
AA\nAAAAWF6bA4ympiZlZmbKbrcrPj5e+/btU3FxseLi4hQfH69du3ZJko4fP67k5GTZbDatXbtWkuRy\nuTRlyhTZ7XbNmjXLvc0XX3xRNptNY8eOVV1dnYemBgAAAAAALhZtDjDKysr03XffqaioSE899ZRy\ncnL0m9/8Rhs3btSf/vQnZWVlSZIWL16sOXPm6MMPP9SSJUt08uRJvfvuu7riiitUVFSkEydOaOvW\nraqqqtI777yj4uJiTZw4UUuWLPH4JAEAAAAAQMfW5gAjIiJCxhgZY1RTU6N/+Zd/UVBQkLp3767I\nyEhVV1dLkj7++GMNHz5cnTp10k033aTdu3ertLRUycnJkqSUlBSVlJTok08+UWJiogICAtxjAAAA\nAAAALXVq6wV69Oih4OBg9evXTydPnlRRUZF+/etf/3ODnTqpoaFBjY2NCgz8Ph8JDQ1VdXW1ampq\nFBIScs6xlhwOhxwOhySpoqKifbMEAAAAAAAdWpsDjPfff1+dOnXS/v37tW3bNj3yyCOtvrfC6XSq\nc+fOCg4OVlNTkwIDA1VbW6vw8HCFhYW5z9ty7LPPPms11lJ6errS09MlSRkZGe2eKAAAAAAA6Lja\nfAiJMUY/+clPJH2/GqO+vl5Op1NfffWVKioq3AFEdHS0CgsL5XQ6tX37dg0YMEBxcXHKz8+XJOXl\n5clmsyk6OlpbtmxpNQYAAAAAANBSm1dgjBo1SqtXr1ZiYqK+++475eTkyOl0KjU1VQEBAXrllVck\nSVlZWbrzzjv1+OOPa9q0aeratavS0tL09ttvy263a9CgQYqNjZUkjR49WjabTd27d9e6des8O0MA\nAAAAANDhBRhjjL+bOF8ZGRnKzc31dxvt0vfRjf5uAT50+OnR/m4BAAAAADqkM733b/MhJAAAAAAA\nAL5GgAEAAAAAACyPAAMAAAAAAFgeAQYAAAAAALA8AgwAAAAAAGB5BBgAAAAAAMDyCDAAAAAAAIDl\nEWAAAAAAAADLI8AAAAAAAACW18nfDQAXo76PbvRb7cNPj/ZbbQAAAADwFlZgAAAAAAAAyyPAAAAA\nAAAAlkeAAQAAAAAALI8AAwAAAAAAWB4BBgAAAAAAsDwCDAAAAAAAYHkEGAAAAAAAwPLaFWAUFhZq\nxIgRGjZsmN566y0VFxcrLi5O8fHx2rVrlyTp+PHjSk5Ols1m09q1ayVJLpdLU6ZMkd1u16xZs9zb\ne/HFF2Wz2TR27FjV1dV5YFoAAAAAAOBi0uYA49tvv9Xzzz+v9957T5s3b9a4ceP0m9/8Rhs3btSf\n/vQnZWVlSZIWL16sOXPm6MMPP9SSJUt08uRJvfvuu7riiitUVFSkEydOaOvWraqqqtI777yj4uJi\nTZw4UUuWLPH4JAEAAAAAQMfW5gBj69at6tq1q8aMGaNx48bp888/V1BQkLp3767IyEhVV1dLkj7+\n+GMNHz5cnTp10k033aTdu3ertLRUycnJkqSUlBSVlJTok08+UWJiogICAtxjAAAAAAAALXVq6wW+\n+OILffbZZ/roo4+Un5+vJ554QiEhIf/cYKdOamhoUGNjowIDv89HQkNDVV1drZqaGvd5zzbWksPh\nkMPhkCRVVFS0b5YAAAAAAKBDa/MKjLCwMNlsNnXu3FkjRozQjh07Wn1vhdPpVOfOnRUcHKympiZJ\nUm1trcLDwxUWFuY+79nGWkpPT1dubq5yc3PVu3fvdk8UAAAAAAB0XG0OMKKjo7V3714ZY1RWVqZ/\n/dd/ldPp1FdffaWKigp3ABEdHa3CwkI5nU5t375dAwYMUFxcnPLz8yVJeXl5stlsio6O1pYtW1qN\nAQAAAAAAtNTmQ0h69OihcePGub+3YuXKlaqsrFRqaqoCAgL0yiuvSJKysrJ055136vHHH9e0adPU\ntWtXpaWl6e2335bdbtegQYMUGxsrSRo9erRsNpu6d++ude
vWeXaGAAAAAACgwwswxhh/N3G+MjIy\nlJub6+822qXvoxv93QJ+JA4/PdrfLQAAAABAu53pvX+bDyEBAAAAAADwNQIMAAAAAABgeQQYAAAA\nAADA8ggwAAAAAACA5RFgAAAAAAAAyyPAAAAAAAAAlkeAAQAAAAAALI8AAwAAAAAAWB4BBgAAAAAA\nsDwCDAAAAAAAYHkEGAAAAAAAwPI6+bsBAJ7V99GNfqt9+OnRfqsNAAAA4OLGCgwAAAAAAGB5rMAA\n4DGs/gAAAADgLazAAAAAAAAAlkeAAQAAAAAALK/dAcbrr7+uyy67TJLkcDgUFxenESNG6OjRo5Kk\nffv2KSEhQXFxcfrggw8kSSdOnND48eMVHx+vZ555xr2trKws2e12TZ48WY2NjRcyHwAAAAAAcBFq\nV4DhcrnkcDjUu3dvOZ1O5eTkqLCwUAsXLlR2drYkae7cuVqxYoU2bdqkefPmSZKWL1+u1NRUFRcX\nq6CgQJWVlSovL1dlZaWKiorUr18/rV+/3nOzAwAAAAAAF4V2BRivv/660tPTFRgYqAMHDqh///7q\n3LmzbDabdu7cKUk6duyYoqKiFBISovDwcFVVVam0tFTJycmSpFGjRmnr1q2txlJSUlRSUuKhqQEA\nAAAAgItFmwMMl8ul3NxcTZw4UZJUU1OjkJCQVqdLUlNTk3ssNDRU1dXVrc57tjEAAAAAAICW2vwz\nqmvXrlVGRoYCA7/PPsLCwlRXV+c+PSgoSJLcp0tSbW2twsPD3ecNCwtTbW2t+vTpI6fT6b588/la\ncjgccjgckqSKioq2tgsAAAAAAC4CbV6BsWfPHr322mtKSUnRgQMH9NJLL2nv3r1qaGhQaWmpBg4c\nKEnq1auXDh48qPr6elVXV6tHjx6Ki4tTfn6+JCk/P18xMTGtxvLy8mSz2VrVS09PV25urnJzc9W7\nd+8LnS8AAAAAAOiA2rwCY/Hixe5/33TTTVq6dKneeOMNJSUlqUuXLlqzZo0kadGiRcrMzJTL5dKC\nBQskSVOnTtWkSZO0cuVKpaWlKSIiQhEREerZs6fsdrsiIyM1e/ZsD00NAAAAAABcLAKMMcbfTZyv\njIwM5ebm+ruNdun76EZ/twDASw4/PdrfLQAAAAAXjTO992/Xr5AAAAAAAAD4UpsPIQEAtObPFVas\n/gAAAMCPBQEGAHRghCcAAAD4seAQEgAAAAAAYHkEGAAAAAAAwPIIMAAAAAAAgOURYAAAAAAAAMsj\nwAAAAAAAAJbHr5AAANqFX0ABAACAL7ECAwAAAAAAWB4BBgAAAAAAsDwOIQEAdDgcvgIAAPDjwwoM\nAAAAAABgeQQYAAAAAADA8jiEBACANuDwFQAAAP9gBQYAAAAAALA8AgwAAAAAAGB5bT6E5OOPP9aD\nDz6o4OBgXXnllXrttdf09ttv64UXXlDXrl21Zs0aRUREaN++fbr33nvldDqVnZ2tESNG6MSJE5o8\nebL+93//V2PHjtWcOXMkSVlZWSotLVXfvn21cuVKBQcHe3yiAAB0dBy+AgAAfszavAKjd+/eKigo\n0JYtW9S3b19t2LBBOTk5Kiws1MKFC5WdnS1Jmjt3rlasWKFNmzZp3rx5kqTly5crNTVVxcXFKigo\nUGVlpcrLy1VZWamioiL169dP69ev9+wMAQAAAABAh9fmAKNXr17q2rWrJKlz587av3+/+vfvr86d\nO8tms2nnzp2SpGPHjikqKkohISEKDw9XVVWVSktLlZycLEkaNWqUtm7d2mosJSVFJSUlnpobAAAA\nAAC4SLT7V0iOHDmi999/X08//bS+/PJL97jL5ZIkNTU1ucdCQ0NVXV2tmpoahYSE/GCsV69ercZa\ncjgccjgckqSKior2tgsAAC4Ah68AAAB/a1eAUVdXp8mTJ2v16tVyuVyqq6tznxYUFCRJCgz85+KO\n2tpahYeHKywsTHV1dQ
oLC1Ntba369Okjp9Ppvnzz+VpKT09Xenq6JCkjI6M97QIAAAAAgA6uzQGG\n0+nUbbfdpieeeELXXnutGhsbtXfvXjU0NGjbtm0aOHCgpO8PNTl48KB++tOfqrq6Wj169FBcXJzy\n8/M1ZcoU5efn6w9/+IOqqqqUk5OjO++8U3l5ebLZbB6fJAAA6LhY/QEAAKR2BBivv/66/va3vyk7\nO1vZ2dmaPn26Zs2apaSkJHXp0kVr1qyRJC1atEiZmZlyuVxasGCBJGnq1KmaNGmSVq5cqbS0NEVE\nRCgiIkI9e/aU3W5XZGSkZs+e7dkZAgAAtBPhCQAA1hFgjDH+buJ8ZWRkKDc3199ttIs/XwABAAC0\nBeEJAMCfzvTev91f4gkAAICLEytPAABW1OafUQUAAAAAAPA1VmAAAADAMlj9AQA4E1ZgAAAAAAAA\ny2MFBgAAAKAf75eus/IEQEdBgAEAAAD8iP1Ygxt/IjQC2ocAAwAAAAB86McaGhHc4EIRYAAAAAAA\nvO7HGtz408UWGvElngAAAAAAwPIIMAAAAAAAgOURYAAAAAAAAMsjwAAAAAAAAJZHgAEAAAAAACyP\nAAMAAAAAAFgeAQYAAAAAALA8AgwAAAAAAGB5BBgAAAAAAMDyLBNgZGVlyW63a/LkyWpsbPR3OwAA\nAAAAwEIsEWCUl5ersrJSRUVF6tevn9avX+/vlgAAAAAAgIVYIsAoLS1VcnKyJCklJUUlJSV+7ggA\nAAAAAFhJJ383IEk1NTXq1auXJCk0NFTV1dXu0xwOhxwOhyRp27ZtysjI8EuPF2qIH2tXVFSod+/e\n1KY2talNbWpTm9rUpja1qU3tH1Ht2Nj/9FvtC3Hw4MHTn2AsYMmSJWbNmjXGGGO2bdtmZsyY4eeO\nLi7p6enUpja1qU1talOb2tSmNrWpTW1qd2iWOIQkLi5O+fn5kqS8vDzZbDY/dwQAAAAAAKwkaP78\n+fP93cTll1+u0tJSZWdnq6GhQY899piCgoL83dZFZcCAAdSmNrWpTW1qU5va1KY2talNbWp3WAHG\nGOPvJgAAAAAAAM7GEoeQAAAAAAAAnA0BBjxu9erVamhokCTNnz9f7777rp87Ai5OLR9rP3ZlZWUa\nMmSIHnnkEX+34lWzZs3St99+6+82LGn37t3KzMz02vZffvllrV692mvbPx1/3N55eXmKjo7Ws88+\n6/VaZWVlWrp0qdfrtNdNN93k1e2f6/XS8ePH9cQTT3i1h6SkJH399dderXEmhw8f1vvvv++X2lbh\n7f2Wv7X3PcGyZcs81sOmTZv01ltveWx77VVWVqaPP/5Y0vf3/QkTJrR5G7Nnz1ZhYWG76jc/n3zz\nzTdKSkrSyJEj27UdKyDAgMfxpgpWceLECZ/UuO2227xe53R4rP3Te++9p8cee0zPP/+8v1vxqt/9\n7nfq2rWrv9uAj/jj9n7zzTe1bNky/cd//MdZz2eM0YUehfxv//Zvmj59+gVtoyM71z788ssv14IF\nC3zYkW8RYFz82vs6xZMBRkpKisaNG+ex7bVXywDDH5qfT8rLy3XDDTe4f0CjIyLAgEdt3bpVZWVl\nuuWWW5STkyNJeuONN5SamqrExET3J0lPPvmkEhMTlZCQoF27dvmzZY/66KOPNHToUA0bNky+/H7c\nxsZGy7yRffbZZ5WUlKQbb7xRf/3rX31e/9tvv9Uf//hHpaSk6NVXX/V6vYKCAg0fPtzrdU516mPt\n6NGjGjlypBISEvTAAw/4vB9f2rx5s2JiYhQTE6PXXntNe/bs0e9//3vNmzfPoy96TlVYWKjk5GSN\nGTNG0dHRftl3NX9aumHDBg0ZMkTDhg3z+SfYDQ0Namxs9GqN0+1HMjMzNW3aNI0aNUq//OUvZYyR\n0+lURkaGRo4cqRdeeKFNNYwxmjFjhux2u4YNG6aioiLFx8fLZrPpqaeekiRVVFTIbrfr
lltuafVi\nz1fPYc239+rVq3Xrrbe673uff/65V+oVFBRow4YNuvfee/XOO+/84LEmfX87zJgxQ8nJyaqqqrqg\neoWFhZo9e7ZuvPFGPfDAAxo6dKgWL14sSfrHP/4hm82m1NRU3XbbbV5Z/WKM0cyZMzVs2DCNHDlS\nR48e1dNPP63Y2Fjde++9ampq8njNZufzeqnlp7R333237Ha7kpKSdPjwYY/28thjjykhIUEPPvig\nJOnkyZOaNGmShg8frrFjx6qurs6j9ZotXbpUb7zxhpKSklRdXe2VGlZ0IfutC3W++1ZPOJ/7eFNT\nk0aOHKnExESNGjVKdXV1Wrp0qfbv36+kpCQVFBS0qeb777+vQYMGKT09XQkJCTp8+LBWr16tl19+\nWevXr3fvX77++mv3a7fVq1fLbrcrLi7OXS8pKUkPP/zwGV9TNZ8eExOj+fPna+bMmbrpppv0u9/9\nTpL097//XTfffLOSkpL00EMPSfr+/v7iiy8qOTlZkvT5559r4sSJuv766911T7fPLS8vV3R0tNLS\n0rRz5842XR+n9vz111/rwQcf1Jtvvqn777+/3dvyO//9gisuVomJiaa+vt4YY8wTTzxhFixYYIwx\nZs6cOWbDhg1m165d5s477zTGGFNZWWnGjh3rt1497fHHHzcbN240xhjjcrl8VvfLL780NpvNPPTQ\nQ2b37t0+q3s6J06cMMYY88UXX5iEhASf1d21a5eZPn26GTlypFmyZImprq72Sd3777/fHDlyxCe1\nTtXysTZjxgzz3nvvGWOMmTJlivnwww/90pMvDB061Hz55ZemoaHBDB482HzzzTfmiSeeMH/5y1+8\nWnfz5s3GZrOZpqYms2fPHjNmzBiv1jud5tt80qRJ5tNPPzXG+G5fs3v3bvPQQw8Zm81mvvzyS6/W\nOt1+5K677jJr1qwxxhiTkZFhysvLjcPhMI899pgxxpilS5eau+6667xrbNiwwTzwwAPuv9PS0sye\nPXtMU1OTGTVqlDl06JCZMWOGycvLM8YYM3HiRLNq1SqfPoc1396rVq0yd999tzHGmFdeecW8+OKL\nXqt51113mV27dhljTv9Yu+uuu8zy5cs9Umvz5s3mkUceMVdddZU5fPiwcTqdZsCAAcYY0+q6v/32\n282qVas8UrOlv/zlL+a3v/2tMcaYjz76yEybNs0kJCS4H+N9+/b1eM2WzvV66dChQ+bWW281DQ0N\nJjY21jQ1NRljPPuYT0xMbHUf3759u3nppZfMihUrjDHG/PnPfzbPPvusx+q11Hz7/9hcyH7rQp3v\nvtVTznUfb9lTTk6OWbZsmTHGmMGDB7er3tChQ83//d//mZMnT5q+ffuaQ4cOmVWrVpmXXnrJfPPN\nN+45r1u3zrzwwgumqqrK3J254DUAAAj1SURBVHzzzaapqcl8/fXXJjEx0d138+uomJgY89VXX/1g\nXsXFxcblcpkrr7zS7NixwzQ2Npobb7zRGGNMenq6+eyzz4wxxkybNs188skn7j6MMebQoUPm5z//\nuWlsbDR79uwx48aNc/d/6j43LS3N7Nu3z7hcLhMbG2s2b97cruum+ba4GB53nfwdoODiN2jQIElS\n7969VVNToz179qi0tFRJSUmSdFH9ZO6MGTP0n//5n1q3bp1+9atfKTU11Sd1e/TooeLiYpWWlurF\nF1/UkSNHNGHCBGVmZio4ONgnPTT74x//qHXr1ikwMNBrnxKezubNm1VSUqJZs2YpPT1d3bp180nd\nf/zjH4qMjPRJrbP57LPPFB0dLUmKjo7WgQMHlJCQ4OeuvMPlcqlHjx6SpGuuuUbHjh3zWe1BgwYp\nICBA/fv39+n9+1S//e1v9dxzz+nbb7/VjBkzFBMT45U6jY2NWr16tdavX68+ffro7rvvdn+S5k1n\n2o+c+nzy2WefafDgwZK+v99/9NFH511j7969SkxM
dP99/Phx9e/fX5J044036uDBgz/YviS/PYe1\nnPv27dt9UvNMj7Xm68JTunfvrj59+kiSunTpIkmtrvvm/3vanj179NZbb2nLli0yxig4OFgDBw50\nP8Z99TzS7NT7d7Pg4GDNmDFDkydP1k9+8hMtWrTIo721vI8fOHBAe/bs0SeffKLXXntNjY2Nstvt\nHqtlJTk5OXrnnXc0evTocx4y5UkXst+6UOe7b/WWU+t8/fXXuu+++3T06FFVV1e363shWnK5XAoP\nD5ckXXfdda1O69q1qyIjI/U///M/Wr9+vV5++WUdPHhQn376qYYNGyZJ+vLLL3/Q65VXXqmvvvpK\noaGhrbY3cOBABQYG6vLLL9cNN9yggIAA92vuffv26Z577pEk1dfX6+abb/5Br9ddd506derU6jo/\n3T73+PHjuvbaayV5b1/Y0XAICTwuODhYLpfL/XdAQID738YY9evXT4mJiSosLFRhYaE2bdrkjza9\nIjQ0VC+//LJWrVqlrKwsn9ePi4vT9OnTFRkZqVdffVX19fU+7+Gll17S5s2b9cYbb3hsGeL5mDlz\npkpKStTU1KRx48YpMzPT6y/y9+/fr5///OderXE2LR9r11xzjfvYyk8++URRUVFer3/06FGv1zid\nwMBAVVVVqbGxUQcOHNAVV1zhs9plZWUyxmj//v3q1auXz+qeqnfv3lq2bJkWL16suXPneq1OfX29\nXn31VUVGRmr69OmKjY31Wq2WzrQfOfX55JprrtGOHTskSdu2bWtTjf79+2vLli3uvy+77DLt3btX\nxhj993//t66++urTbt9fz2Gnzt0XzvRYCwz07MvHlnNr1vK6b/6/p/Xr108ZGRkqLCzUhx9+qFWr\nVmnXrl3ux7i3v9zyXK+XmrlcLmVkZGjt2rXq2bOn3nzzTY/20fI+fs0116hfv3769a9/rcLCQpWU\nlCg7O9uj9ZqdOn9fe/jhh1VYWOjT8ELSBe23LtT57ls95Vz38by8PF111VX68MMPlZmZ6a59un3C\n+QgKClJNTY0aGhr06aef/uD0iRMnatmyZfrmm290xRVX6Gc/+5kGDhyozZs3q7CwUGVlZWfs9VQt\nTz+132uvvVZr1qxRYWGhtm3bprS0tPN6vJ9un9uzZ08dOHDA/dwEiRUY8LixY8cqIyNDt95662lP\nHzhwoKKiopSYmKjAwECNGjXK4y/Ajx8/rqVLl/r8y69+//vf680335TT6fTpt0p/9913Wrp0qTZs\n2KABAwbo/vvvdyfHvhYfH6/4+HjFxMT4/NOrbt266Z577tE999yjvXv36u9//7tX623atEm33HKL\nV2ucTcvHWlZWlu666y49+eSTuu6667y++sLpdOr2229XUVGRV+uczpNPPqnRo0crICBADzzwgE+/\n5DA0NFRjxozRF198oRUrVvis7qkWLFigrVu3qqGhQTNnzvRanfDwcG3fvl07duzQihUr9Omnn+oX\nv/iFpk+frksuucRrdc93P/LLX/5Sf/7znzVixIg2h4ljxozRpk2bFB8fr+DgYM2fP19Tp06VMUaj\nR49W3759NWfOHN1xxx167rnnFBISIsk3z2FW4c/H2pw5c3T77bfr+eefV9euXb2ymnDMmDEqKCjQ\nsGHDFBAQoF/96ldKTk5WbGysBg8erO7du3u8Zkvner3UrL6+Xr/4xS8UEBCggIAArVu3zqN9vPfe\ne1q4cKFuuOEGDR48WAMGDNC9996rVatWSZIeeeQRjR492qM1Jen666/XY489pvT0dP3hD39QWFiY\nx2tY0YXsty6Ur1+jnes+HhMToyeffFI7duxQz5493Star732Wt166616+OGHZbPZzrvewoULNWLE\nCF111VW6/PLLf7DfuPnmmzVlyhQtXLhQ0vcrmG+77TYlJiYqKChI119/vf7rv/6rnbP9p8WLF2va\ntGk6efKkgoKC
tHLlSsXGxurOO+/U3/72Nz355JOnvdzp9rnZ2dm644479NOf/tTr+6SOIsD48iNS\n+IzT6TzteGBgoMc+OfFFDZyf6upqbdq0SePHj3cvv/U2f93+VrrfTZgwQevWrfPqGznJWnNu9vHH\nH6u8vFz//u//7rUaVpt3YWGh3n33XT333HNer2W1uTc7efKk3nzzTaWkpLiX6V4Iq87T1/xxPVj9\nunc6nerU6fvP2e644w49+OCDGjp0qJ+7ah8rXNdW6AG+4+vb2yr1XC6XLrnkEn333XeKjo7Wjh07\nLugwv474uOmIPbfVxTELtHL06FEFBwef9r/mxLEj1MD5Cw8P1x133OGz8MJft7/V7nfjx4/3enhh\ntTk3GzJkiFfDC6vO2xesPPcuXbrojjvu8Eh4YeV5+pI/roeOcN0fOXJEdrtdsbGxCgkJ6bDhhRWu\nayv0AN/x9e1tpXo33nijkpKSFBsbq1mzZl1QeNERHzcdsef2YAUGAAAAAACwPFZgAAAAAAAAyyPA\nAAAAAAAAlkeAAQAAAAAALI8AAwAAAAAAWB4BBgAAAAAAsLz/Byf5uN9NkeDaAAAAAElFTkSuQmCC\n",
            "text/plain": [
              "<Figure size 1320x240 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "an5Q1dNIuU6q",
        "colab_type": "text"
      },
      "source": [
        "Plotting your frequencies as a bar chart you start to see a nice picture of your data. Not that suprising, but the most common token happens to be the period and some other common syntactical tokens like curly braces and also key words like *if* and *return*."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "7N56xK19CYPJ",
        "colab_type": "code",
        "outputId": "fadb9f6f-60ad-4eca-8e83-79f26c83d287",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 812
        }
      },
      "source": [
        "#collapse_show\n",
        "def plot_hist(lens, n_bins = 50):\n",
        "    n, bins, patches = plt.hist(lens, n_bins, facecolor='blue', alpha=0.9)\n",
        "    plt.show()\n",
        "\n",
        "print(mean(code_lens), median(code_lens), stdev(code_lens))\n",
        "plot_hist(code_lens)\n",
        "print(mean(locs), median(locs), stdev(locs))\n",
        "plot_hist(locs)\n",
        "print(mean(comment_lens), median(comment_lens), stdev(comment_lens))\n",
        "plot_hist(comment_lens)"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "3.4662519750610943 3.0 2.6490695431339177\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZAAAAD4CAYAAADCb7BPAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAUDUlEQVR4nO3dbYyd5X3n8e+vOCQsbbAJXgvbaM0q\nViIqNQRG4ChR1RIVDFvFvIgioqpYkRW/CFklUqUWdqVFTfqCvCkNUoqEAsVU2RCWNouFkrheB2ml\nlXgYB8KTw3qaBGEb8ATzsC1SsqT/fXGuWQ7TsX3msjlzxv5+pKNz3//7us/1n9Gxfr4fzplUFZIk\nLdZvLHUDkqTlyQCRJHUxQCRJXQwQSVIXA0SS1GXFUjcwLuedd15t2LBhqduQpGVl7969v6iq1Qtt\nO20CZMOGDUxPTy91G5K0rCR5/mjbPIUlSepigEiSuhggkqQuBogkqYsBIknqYoBIkroYIJKkLgaI\nJKmLASJJ6nLafBL9RKxdu3D90KHx9iFJk2SkI5AkK5Pcn+QnSfYl+ViSc5PsTrK/Pa9qY5PktiQz\nSZ5McsnQ62xt4/cn2TpUvzTJU22f25Kk1Rc9hyRpPEY9hfV14AdV9WHgI8A+4EZgT1VtBPa0dYCr\ngY3tsR24HQZhANwMXA5cBtw8FwhtzOeH9tvc6ouaQ5I0PscNkCTnAL8L3AlQVb+qqteALcCONmwH\ncG1b3gLcUwMPAyuTnA9cBeyuqiNV9SqwG9jctr2/qh6uwR9ov2feay1mDknSmIxyBHIhMAv8TZLH\nk3wzydnAmqp6sY15CVjTltcBLwztf6DVjlU/sECdjjneIcn2JNNJpmdnZ0f4USVJoxolQFYAlwC3\nV9VHgX/m7VNJALQjhzr57Z3YHFV1R1VNVdXU6tULfp29JKnTKAFyADhQVY+09fsZBMrLc6eN2vPh\ntv0gcMHQ/utb7Vj19QvU6ZhDkjQmxw2QqnoJeCHJh1rpk8CzwE5g7k6qrcADbXkncH27U2oT8Ho7\nDbULuDLJqnbx/EpgV9v2RpJN7e6r6+e91mLmkCSNyaifA/mPwLeSnAn8FPgcg/C5L8k24HngM23s\n94BrgBngzTaWqjqS5KvAY23cV6rqSFv+AnA3cBbw/fYAuGUxc0iSxieDSwunvqmpqer9k7Z+kFDS\n6SrJ3qqaWmibX2UiSepigEiSuhggkqQuBogkqYsBIknqYoBIkroYIJKkLgaIJKmLASJJ6mKASJK6\nGCCSpC4GiCSpiwEiSepigEiSuhggkqQuBogkqYsBIknqYoBIkroYIJKkLgaIJKmLASJJ6mKASJK6\nGCCSpC4GiCSpiwEiSeoyUoAk+XmSp5I8kWS61c5NsjvJ/va8qtWT5LYkM0meTHLJ0OtsbeP3J9k6\nVL+0vf5M2ze9c0iSxmMxRyC/X1UXV9VUW78R2FNVG4E9bR3gamBje2wHbodBGAA3A5cDlwE3zwVC\nG/P5of0298whSRqfEzmFtQXY0ZZ3ANcO1e+pgYeBlUnOB64CdlfVkap6FdgNbG7b3l9VD1dVAffM\ne63FzCFJGpNRA6SAf0iyN8n2VltTVS+25ZeANW15HfDC0L4HWu1Y9QML1HvmeIck25NMJ5menZ0d\n6QeVJI1mxYjjPlFVB5P8W2B3kp8Mb6yqSlInv70Tm6Oq7gDuAJiamnpX+5Ok081IRyBVdbA9Hwa+\ny+Aaxstzp43a8+E2/CBwwdDu61vtWPX1C9TpmEOSNCbHDZAkZyf5rbll4ErgaWAnMHcn1Vbggba8\nE7i+3Sm1CXi9nYbaBVyZZFW7eH4lsKtteyPJpnb31fXzXmsxc0iSxmSUU1hrgO+2O2tXAP+1qn6Q\n5DHgviTbgOeBz7Tx3wOuAWaA
N4HPAVTVkSRfBR5r475SVUfa8heAu4GzgO+3B8Ati5lDkjQ+Gdz4\ndOqbmpqq6enprn3Xrl24fujQCTQkSctAkr1DH994Bz+JLknqYoBIkroYIJKkLgaIJKmLASJJ6mKA\nSJK6GCCSpC4GiCSpiwEiSepigEiSuhggkqQuBogkqYsBIknqYoBIkroYIJKkLgaIJKmLASJJ6mKA\nSJK6GCCSpC4GiCSpiwEiSepigEiSuhggkqQuBogkqYsBIknqMnKAJDkjyeNJHmzrFyZ5JMlMku8k\nObPV39vWZ9r2DUOvcVOrP5fkqqH65labSXLjUH3Rc0iSxmMxRyBfAvYNrX8NuLWqPgi8Cmxr9W3A\nq61+axtHkouA64DfBjYDf91C6QzgG8DVwEXAZ9vYRc8hSRqfkQIkyXrgPwDfbOsBrgDub0N2ANe2\n5S1tnbb9k238FuDeqvplVf0MmAEua4+ZqvppVf0KuBfY0jmHJGlMRj0C+SvgT4F/aesfAF6rqrfa\n+gFgXVteB7wA0La/3sb///q8fY5W75njHZJsTzKdZHp2dnbEH1WSNIrjBkiSPwQOV9XeMfRzUlXV\nHVU1VVVTq1evXup2JOmUsmKEMR8HPpXkGuB9wPuBrwMrk6xoRwDrgYNt/EHgAuBAkhXAOcArQ/U5\nw/ssVH+lYw5J0pgc9wikqm6qqvVVtYHBRfAfVtUfAQ8Bn27DtgIPtOWdbZ22/YdVVa1+XbuD6kJg\nI/Ao8Biwsd1xdWabY2fbZ7FzSJLGZJQjkKP5M+DeJH8BPA7c2ep3An+bZAY4wiAQqKpnktwHPAu8\nBdxQVb8GSPJFYBdwBnBXVT3TM4ckaXxyuvzHfWpqqqanp7v2Xbt24fqhQyfQkCQtA0n2VtXUQtv8\nJLokqYsBIknqYoBIkroYIJKkLgaIJKmLASJJ6mKASJK6GCCSpC4GiCSpiwEiSepigEiSuhggkqQu\nBogkqYsBIknqYoBIkroYIJKkLgaIJKmLASJJ6mKASJK6GCCSpC4GiCSpiwEiSepigEiSuhggkqQu\nxw2QJO9L8miSHyd5Jsmft/qFSR5JMpPkO0nObPX3tvWZtn3D0Gvd1OrPJblqqL651WaS3DhUX/Qc\nkqTxGOUI5JfAFVX1EeBiYHOSTcDXgFur6oPAq8C2Nn4b8Gqr39rGkeQi4Drgt4HNwF8nOSPJGcA3\ngKuBi4DPtrEsdg5J0vgcN0Bq4J/a6nvao4ArgPtbfQdwbVve0tZp2z+ZJK1+b1X9sqp+BswAl7XH\nTFX9tKp+BdwLbGn7LHYOSdKYjHQNpB0pPAEcBnYD/wi8VlVvtSEHgHVteR3wAkDb/jrwgeH6vH2O\nVv9AxxySpDEZKUCq6tdVdTGwnsERw4ff1a5OkiTbk0wnmZ6dnV3qdiTplLKou7Cq6jXgIeBjwMok\nK9qm9cDBtnwQuACgbT8HeGW4Pm+fo9Vf6Zhjfr93VNVUVU2tXr16MT+qJOk4RrkLa3WSlW35LOAP\ngH0MguTTbdhW4IG2vLOt07b/sKqq1a9rd1BdCGwEHgUeAza2O67OZHChfWfbZ7FzSJLGZMXxh3A+\nsKPdLfUbwH1V9WCSZ4F7k/wF8DhwZxt/J/C3SWaAIwwCgap6Jsl9wLPAW8ANVfVrgCRfBHYBZwB3\nVdUz7bX+bDFzSJLGJ6fLf9ynpqZqenq6a9+1axeuHzp0Ag1J0jKQZG9VTS20zU+iS5K6GCCSpC4G\niCSpiwEiSepigEiSuhggkqQuBogkqYsBIknqYoBIkroYIJKkLgaIJKmLASJJ6mKASJK6GCCSpC4G\niCSpiwEiSepigEiSuhggkqQuBogkqYsBIknqYoBIkroYIJKkLgaIJKnLiqVu4FS0du3Rtx06NL4+\nJOnd5BGIJKnLcQMkyQVJHkrybJJnknyp1c9NsjvJ/va8qtWT5LYkM0meTHLJ0GttbeP3J9k6VL
80\nyVNtn9uSpHcOSdJ4jHIE8hbwJ1V1EbAJuCHJRcCNwJ6q2gjsaesAVwMb22M7cDsMwgC4GbgcuAy4\neS4Q2pjPD+23udUXNYckaXyOGyBV9WJV/agt/x9gH7AO2ALsaMN2ANe25S3APTXwMLAyyfnAVcDu\nqjpSVa8Cu4HNbdv7q+rhqirgnnmvtZg5JEljsqhrIEk2AB8FHgHWVNWLbdNLwJq2vA54YWi3A612\nrPqBBep0zDG/3+1JppNMz87OjvZDSpJGMnKAJPlN4O+AL1fVG8Pb2pFDneTe3qFnjqq6o6qmqmpq\n9erV71JnknR6GilAkryHQXh8q6r+vpVfnjtt1J4Pt/pB4IKh3de32rHq6xeo98whSRqTUe7CCnAn\nsK+q/nJo005g7k6qrcADQ/Xr251Sm4DX22moXcCVSVa1i+dXArvatjeSbGpzXT/vtRYzhyRpTEb5\nIOHHgT8GnkryRKv9J+AW4L4k24Dngc+0bd8DrgFmgDeBzwFU1ZEkXwUea+O+UlVH2vIXgLuBs4Dv\ntweLnUOSND4ZXFo49U1NTdX09HTXvkf7ZPnRPlXuJ9ElnSqS7K2qqYW2+Ul0SVIXA0SS1MUAkSR1\nMUAkSV0MEElSFwNEktTFAJEkdTFAJEldDBBJUhcDRJLUxQCRJHUZ5csUdRTH+s4rSTrVeQQiSepi\ngEiSuhggkqQuBogkqYsBIknqYoBIkroYIJKkLgaIJKmLASJJ6mKASJK6GCCSpC4GiCSpiwEiSepy\n3ABJcleSw0meHqqdm2R3kv3teVWrJ8ltSWaSPJnkkqF9trbx+5NsHapfmuSpts9tSdI7hyRpfEY5\nArkb2DyvdiOwp6o2AnvaOsDVwMb22A7cDoMwAG4GLgcuA26eC4Q25vND+23umUOSNF7HDZCq+p/A\nkXnlLcCOtrwDuHaofk8NPAysTHI+cBWwu6qOVNWrwG5gc9v2/qp6uKoKuGfeay1mDknSGPVeA1lT\nVS+25ZeANW15HfDC0LgDrXas+oEF6j1z/CtJtieZTjI9Ozs74o8mSRrFCV9Eb0cOdRJ6OelzVNUd\nVTVVVVOrV69+FzqTpNNXb4C8PHfaqD0fbvWDwAVD49a32rHq6xeo98whSRqj3gDZCczdSbUVeGCo\nfn27U2oT8Ho7DbULuDLJqnbx/EpgV9v2RpJN7e6r6+e91mLmkCSN0YrjDUjybeD3gPOSHGBwN9Ut\nwH1JtgHPA59pw78HXAPMAG8CnwOoqiNJvgo81sZ9parmLsx/gcGdXmcB328PFjuHJGm8Mri8cOqb\nmpqq6enprn3Xrj15fRw6dPJeS5LebUn2VtXUQtv8JLokqctxT2Hp5Frs0YxHLJImlUcgkqQuBogk\nqYsBIknqYoBIkrp4EX3CHe2iuxfXJS01j0AkSV0MEElSFwNEktTFAJEkdTFAJEldDBBJUhcDRJLU\nxQCRJHUxQCRJXQwQSVIXA0SS1MUAkSR1MUAkSV0MEElSFwNEktTFvweyTPl3QiQtNY9AJEldDBBJ\nUpdleworyWbg68AZwDer6pYlbmkiHO3U1tF4yktSr2V5BJLkDOAbwNXARcBnk1y0tF1J0ulluR6B\nXAbMVNVPAZLcC2wBnl3SrpahxR6xHI1HMtLpZ7kGyDrghaH1A8Dl8wcl2Q5sb6v/lOS5EV//POAX\nJ9Th+C1pz8mid/F3PB72/O5bbv3C4nr+d0fbsFwDZCRVdQdwx2L3SzJdVVPvQkvvmuXW83LrF+x5\nXJZbz8utXzh5PS/LayDAQeCCofX1rSZJGpPlGiCPARuTXJjkTOA6YOcS9yRJp5VleQqrqt5K8kVg\nF4PbeO+qqmdO4hSLPu01AZZbz8utX7DncVluPS+3fuEk9ZyqOhmvI0k6zSzXU1iSpCVmgEiSuhgg\nQ5JsTvJckpkkNy51PwtJcleSw0meHqqdm2R3kv3tedVS9j
hfkguSPJTk2STPJPlSq09s30nel+TR\nJD9uPf95q1+Y5JH2HvlOu4ljYiQ5I8njSR5s65Pe78+TPJXkiSTTrTax7wuAJCuT3J/kJ0n2JfnY\nJPec5EPt9zv3eCPJl09GzwZIs4y+HuVuYPO82o3AnqraCOxp65PkLeBPquoiYBNwQ/vdTnLfvwSu\nqKqPABcDm5NsAr4G3FpVHwReBbYtYY8L+RKwb2h90vsF+P2qunjocwmT/L6AwXfw/aCqPgx8hMHv\ne2J7rqrn2u/3YuBS4E3gu5yMnqvKx+BGgo8Bu4bWbwJuWuq+jtLrBuDpofXngPPb8vnAc0vd43H6\nfwD4g+XSN/BvgB8x+LaDXwArFnrPLPWDweeh9gBXAA8CmeR+W08/B86bV5vY9wVwDvAz2g1Iy6Hn\neX1eCfyvk9WzRyBvW+jrUdYtUS+LtaaqXmzLLwFrlrKZY0myAfgo8AgT3nc7HfQEcBjYDfwj8FpV\nvdWGTNp75K+APwX+pa1/gMnuF6CAf0iyt331EEz2++JCYBb4m3aq8JtJzmayex52HfDttnzCPRsg\np5ga/HdiIu/NTvKbwN8BX66qN4a3TWLfVfXrGhz2r2fwBZ4fXuKWjirJHwKHq2rvUveySJ+oqksY\nnDq+IcnvDm+cwPfFCuAS4Paq+ijwz8w79TOBPQPQrn99Cvhv87f19myAvG05fz3Ky0nOB2jPh5e4\nn38lyXsYhMe3qurvW3ni+waoqteAhxicAlqZZO4DuJP0Hvk48KkkPwfuZXAa6+tMbr8AVNXB9nyY\nwXn5y5js98UB4EBVPdLW72cQKJPc85yrgR9V1ctt/YR7NkDetpy/HmUnsLUtb2VwjWFiJAlwJ7Cv\nqv5yaNPE9p1kdZKVbfksBtds9jEIkk+3YRPTc1XdVFXrq2oDg/fuD6vqj5jQfgGSnJ3kt+aWGZyf\nf5oJfl9U1UvAC0k+1EqfZPBnJCa25yGf5e3TV3Ayel7qizqT9ACuAf43g3Pd/3mp+zlKj98GXgT+\nL4P/DW1jcK57D7Af+B/AuUvd57yeP8Hg8PhJ4In2uGaS+wZ+B3i89fw08F9a/d8DjwIzDE4FvHep\ne12g998DHpz0fltvP26PZ+b+zU3y+6L1dzEw3d4b/x1YtQx6Pht4BThnqHbCPftVJpKkLp7CkiR1\nMUAkSV0MEElSFwNEktTFAJEkdTFAJEldDBBJUpf/BxrCEguPRLjZAAAAAElFTkSuQmCC\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        },
        {
          "output_type": "stream",
          "text": [
            "18.54957629991187 10 50.99032748692644\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAD4CAYAAADsKpHdAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAVGElEQVR4nO3db4ydZ3nn8e+vMYEsbbBNvJb/RGuj\nWlTpi4bkKHFEVXVhcZy0wnmBUKJq7c1m8WoDK9hdqZtsX0SFvoDVqhRradqIUBxECWkKGytK6vWa\nSPsqIeNC85esJ6HZ2E7iAefPtkjQ0GtfnMtwcMb2sT2eGc98P9LRuZ/rvp/n3Pc8Jr85z3nOkKpC\nkqRfmOsJSJLmBwNBkgQYCJKkZiBIkgADQZLUlsz1BE7XRRddVOvWrZvraUjSOWPfvn3fr6oVx+s/\nZwNh3bp1TExMzPU0JOmckeT5E/V7yUiSBBgIkqRmIEiSAANBktQMBEkSYCBIkpqBIEkCDARJUjMQ\nJEnAOfxN5TOxevX09UOHZncekjSf+A5BkgQYCJKkZiBIkgADQZLUDARJEjBGICR5d5LvjDxeT/KJ\nJMuT7Emyv5+X9fgk2ZFkMsljSS4bOda2Hr8/ybaR+uVJHu99diTJ2VmuJOl4ThoIVfVMVV1aVZcC\nlwM/BL4B3ALsraoNwN7eBrgG2NCP7cDtAEmWA7cBVwJXALcdDZEe85GR/TbPyOokSWM71UtG7wee\nrarngS3Azq7vBK7r9hbgrhp6GFiaZBVwNbCnqo5U1SvAHmBz911YVQ9XVQF3jRxLkjRLTjUQrge+\n2u2VVfVit18CVnZ7DfDCyD4Hunai+oFp6pKkWTR2ICQ5H/gg8BfH9vVv9jWD8zreHLYnmUgyMTU1\ndbZfTpIWlVN5h3AN8NdV9XJvv9yXe+jnw10/CFw8st/arp2ovnaa+ptU1R1VNaiqwYoVK05h6pKk\nkzmVQLiBn10uAtgFHL1TaBtw30h9a99ttBF4rS8t7QY2JVnWHyZvAnZ33+tJNvbdRVtHjiVJmiVj\n/XG7JG8HPgD825Hyp4F7ktwEPA98uOsPANcCkwzvSLoRoKqOJPkU8GiP+2RVHen2zcCXgAuAB/sh\nSZpFGV7+P/cMBoOamJg4rX39a6eSFqMk+6pqcLx+v6ksSQIMBElSMxAkSYCBIElqBoIkCTAQJEnN\nQJAkAQaCJKkZCJIkwECQJDUDQZIEGAiSpGYgSJIAA0GS1AwESRJgIEiSmoEgSQIMBElSMxAkSYCB\nIElqYwVCkqVJ7k3y3SRPJ7kqyfIke5Ls7+dlPTZJdiSZTPJYkstGjrOtx+9Psm2kfnmSx3ufHUky\n80uVJJ3IuO8QPgf8VVX9CvBrwNPALcDeqtoA7O1tgGuADf3YDtwOkGQ5cBtwJXAFcNvREOkxHxnZ\nb/OZLUuSdKpOGghJ3gH8BnAnQFX9uKpeBbYAO3vYTuC6bm8B7qqhh4GlSVYBVwN7qupIVb0C7AE2\nd9+FVfVwVRVw18ixJEmzZJx3COuBKeDPknw7yReSvB1YWVUv9piXgJXdXgO8MLL/ga6dqH5gmvqb\nJNmeZCLJxNTU1BhTlySNa5xAWAJcBtxeVe8B/p6fXR4CoH+zr5mf3s+rqjuqalBVgxUrVpztl5Ok\nRWWcQDgAHKiqR3r7XoYB8XJf7qGfD3f/QeDikf3Xdu1E9bXT1CVJs+ikgVBVLwEvJHl3l94PPAXs\nAo7eKbQNuK/bu4CtfbfRRuC1vrS0G9iUZFl/mLwJ2N19ryfZ2HcXbR05liRpliwZc9y/B76S5Hzg\nOeBGhmFyT5KbgOeBD/fYB4BrgUnghz2WqjqS5FPAoz3uk1V1pNs3A18CLgAe7IckaRZlePn/3DMY\nDGpiYuK09l29evr6oUNnMCFJ
mueS7KuqwfH6/aayJAkwECRJzUCQJAEGgiSpGQiSJMBAkCQ1A0GS\nBBgIkqRmIEiSAANBktQMBEkSYCBIkpqBIEkCDARJUjMQJEmAgSBJagaCJAkwECRJzUCQJAFjBkKS\nv03yeJLvJJno2vIke5Ls7+dlXU+SHUkmkzyW5LKR42zr8fuTbBupX97Hn+x9M9MLlSSd2Km8Q/jn\nVXXpyP9B8y3A3qraAOztbYBrgA392A7cDsMAAW4DrgSuAG47GiI95iMj+20+7RVJkk7LmVwy2gLs\n7PZO4LqR+l019DCwNMkq4GpgT1UdqapXgD3A5u67sKoerqoC7ho5liRplowbCAX8zyT7kmzv2sqq\nerHbLwEru70GeGFk3wNdO1H9wDT1N0myPclEkompqakxpy5JGseSMcf9elUdTPJPgT1JvjvaWVWV\npGZ+ej+vqu4A7gAYDAZn/fUkaTEZ6x1CVR3s58PANxh+BvByX+6hnw/38IPAxSO7r+3aieprp6lL\nkmbRSQMhyduT/NLRNrAJeALYBRy9U2gbcF+3dwFb+26jjcBrfWlpN7ApybL+MHkTsLv7Xk+yse8u\n2jpyLEnSLBnnktFK4Bt9J+gS4M+r6q+SPArck+Qm4Hngwz3+AeBaYBL4IXAjQFUdSfIp4NEe98mq\nOtLtm4EvARcAD/ZDkjSLMryx59wzGAxqYmLitPZdvXr6+qFDZzAhSZrnkuwb+erAm/hNZUkSYCBI\nkpqBIEkCDARJUjMQJEmAgSBJagaCJAkwECRJzUCQJAEGgiSpGQiSJMBAkCQ1A0GSBBgIkqRmIEiS\nAANBktQMBEkSYCBIkpqBIEkCTiEQkpyX5NtJ7u/t9UkeSTKZ5GtJzu/6W3t7svvXjRzj1q4/k+Tq\nkfrmrk0muWXmlidJGtepvEP4OPD0yPZngM9W1S8DrwA3df0m4JWuf7bHkeQS4HrgV4HNwB93yJwH\nfB64BrgEuKHHSpJm0ViBkGQt8FvAF3o7wPuAe3vITuC6bm/pbbr//T1+C3B3Vf2oqr4HTAJX9GOy\nqp6rqh8Dd/dYSdIsGvcdwh8Bvwv8Y2+/E3i1qt7o7QPAmm6vAV4A6P7XevxP68fsc7z6myTZnmQi\nycTU1NSYU5ckjeOkgZDkt4HDVbVvFuZzQlV1R1UNqmqwYsWKuZ6OJC0oS8YY817gg0muBd4GXAh8\nDliaZEm/C1gLHOzxB4GLgQNJlgDvAH4wUj9qdJ/j1SVJs+Sk7xCq6taqWltV6xh+KPzNqvod4CHg\nQz1sG3Bft3f1Nt3/zaqqrl/fdyGtBzYA3wIeBTb0XUvn92vsmpHVSZLGNs47hOP5z8DdSf4A+DZw\nZ9fvBL6cZBI4wvA/8FTVk0nuAZ4C3gA+WlU/AUjyMWA3cB7wxap68gzmJUk6DRn+8n7uGQwGNTEx\ncVr7rl49ff3QoTOYkCTNc0n2VdXgeP1+U1mSBBgIkqRmIEiSAANBktQMBEkSYCBIkpqBIEkCDARJ\nUjMQJEmAgSBJagaCJAkwECRJzUCQJAEGgiSpGQiSJMBAkCQ1A0GSBBgIkqRmIEiSgDECIcnbknwr\nyd8keTLJ73d9fZJHkkwm+VqS87v+1t6e7P51I8e6tevPJLl6pL65a5NJbpn5ZUqSTmacdwg/At5X\nVb8GXApsTrIR+Azw2ar6ZeAV4KYefxPwStc/2+NIcglwPfCrwGbgj5Ocl+Q84PPANcAlwA09VpI0\ni04aCDX0d735ln4U8D7g3q7vBK7r9pbepvvfnyRdv7uqflRV3wMmgSv6MVlVz1XVj4G7e6wkaRaN\n9RlC/yb/HeAwsAd4Fni1qt7oIQeANd1eA7wA0P2vAe8crR+zz/Hq081je5KJJBNTU1PjTF2SNKax\nAqGqflJVlwJrGf5G/ytndVbHn8cdVTWoqsGKFSvmYgqStGCd0l1GVfUq8BBwFbA0yZLuWgsc7P
ZB\n4GKA7n8H8IPR+jH7HK8uSZpF49xltCLJ0m5fAHwAeJphMHyoh20D7uv2rt6m+79ZVdX16/supPXA\nBuBbwKPAhr5r6XyGHzzvmonFSZLGt+TkQ1gF7Oy7gX4BuKeq7k/yFHB3kj8Avg3c2ePvBL6cZBI4\nwvA/8FTVk0nuAZ4C3gA+WlU/AUjyMWA3cB7wxap6csZWKEkaS4a/vJ97BoNBTUxMnNa+q1dPXz90\n6AwmJEnzXJJ9VTU4Xr/fVJYkAQaCJKkZCJIkwECQJDUDQZIEGAiSpGYgSJIAA0GS1AwESRJgIEiS\nmoEgSQIMBElSMxAkSYCBIElqBoIkCTAQJEnNQJAkAQaCJKkZCJIkYIxASHJxkoeSPJXkySQf7/ry\nJHuS7O/nZV1Pkh1JJpM8luSykWNt6/H7k2wbqV+e5PHeZ0eSnI3FSpKOb5x3CG8A/6mqLgE2Ah9N\ncglwC7C3qjYAe3sb4BpgQz+2A7fDMECA24ArgSuA246GSI/5yMh+m898aZKkU3HSQKiqF6vqr7v9\n/4CngTXAFmBnD9sJXNftLcBdNfQwsDTJKuBqYE9VHamqV4A9wObuu7CqHq6qAu4aOZYkaZac0mcI\nSdYB7wEeAVZW1Yvd9RKwsttrgBdGdjvQtRPVD0xTn+71tyeZSDIxNTV1KlOXJJ3E2IGQ5BeBvwQ+\nUVWvj/b1b/Y1w3N7k6q6o6oGVTVYsWLF2X45SVpUxgqEJG9hGAZfqaqvd/nlvtxDPx/u+kHg4pHd\n13btRPW109QlSbNonLuMAtwJPF1VfzjStQs4eqfQNuC+kfrWvttoI/BaX1raDWxKsqw/TN4E7O6+\n15Ns7NfaOnIsSdIsWTLGmPcC/xJ4PMl3uvZfgE8D9yS5CXge+HD3PQBcC0wCPwRuBKiqI0k+BTza\n4z5ZVUe6fTPwJeAC4MF+SJJmUYaX/889g8GgJiYmTmvf1aunrx86dAYTkqR5Lsm+qhocr99vKkuS\nAANBktQMBEkSYCBIkpqBIEkCDARJUjMQJEmAgSBJagaCJAkwECRJzUCQJAEGgiSpGQiSJMBAkCQ1\nA0GSBBgIkqRmIEiSAANBktQMBEkSMEYgJPliksNJnhipLU+yJ8n+fl7W9STZkWQyyWNJLhvZZ1uP\n359k20j98iSP9z47kmSmFylJOrlx3iF8Cdh8TO0WYG9VbQD29jbANcCGfmwHbodhgAC3AVcCVwC3\nHQ2RHvORkf2OfS1J0iw4aSBU1f8GjhxT3gLs7PZO4LqR+l019DCwNMkq4GpgT1UdqapXgD3A5u67\nsKoerqoC7ho5liRpFp3uZwgrq+rFbr8ErOz2GuCFkXEHunai+oFp6pKkWXbGHyr3b/Y1A3M5qSTb\nk0wkmZiampqNl5SkReN0A+HlvtxDPx/u+kHg4pFxa7t2ovraaerTqqo7qmpQVYMVK1ac5tQlSdM5\n3UDYBRy9U2gbcN9IfWvfbbQReK0vLe0GNiVZ1h8mbwJ2d9/rSTb23UVbR44lSZpFS042IMlXgd8E\nLkpygOHdQp8G7klyE/A88OEe/gBwLTAJ/BC4EaCqjiT5FPBoj/tkVR39oPpmhncyXQA82A9J0izL\n8COAc89gMKiJiYnT2nf16unrhw6dwYQkaZ5Lsq+qBsfr95vKkiTAQJAkNQNBkgQYCJKkZiBIkgAD\nQZLUDARJEmAgSJKagSBJAgwESVIzECRJgIEgSWoGgiQJMBAkSc1AkCQBBoIkqRkIkiTAQJAkNQNB\nkgQYCJKkNm8CIcnmJM8kmUxyy1zPR5IWm3kRCEnOAz4PXANcAtyQ5JK5nZUkLS5L5noC7Qpgsqqe\nA0hyN7AFeGo2J7F69fT1Q4dmcxaSNDfmSyCsAV4Y2T4AXHnsoCTbge29+XdJnjnN17sI+P64g5PT\nfJX55ZTWvEC45sXBNY/vn52oc74Ewliq6g7gjjM9TpKJqh
rMwJTOGa55cXDNi8PZWvO8+AwBOAhc\nPLK9tmuSpFkyXwLhUWBDkvVJzgeuB3bN8ZwkaVGZF5eMquqNJB8DdgPnAV+sqifP4kue8WWnc5Br\nXhxc8+JwVtacqjobx5UknWPmyyUjSdIcMxAkScAiC4SF9Ocxklyc5KEkTyV5MsnHu748yZ4k+/t5\nWdeTZEev/bEkl40ca1uP359k21ytaVxJzkvy7ST39/b6JI/02r7WNyaQ5K29Pdn960aOcWvXn0ly\n9dysZDxJlia5N8l3kzyd5KqFfp6T/If+d/1Ekq8medtCO89JvpjkcJInRmozdl6TXJ7k8d5nRzLG\nN6qqalE8GH5Y/SzwLuB84G+AS+Z6XmewnlXAZd3+JeD/MPyzH/8VuKXrtwCf6fa1wINAgI3AI11f\nDjzXz8u6vWyu13eStf9H4M+B+3v7HuD6bv8J8O+6fTPwJ92+Hvhaty/p8/9WYH3/uzhvrtd1gvXu\nBP5Nt88Hli7k88zwi6rfAy4YOb//aqGdZ+A3gMuAJ0ZqM3ZegW/12PS+15x0TnP9Q5nFH/5VwO6R\n7VuBW+d6XjO4vvuADwDPAKu6tgp4ptt/CtwwMv6Z7r8B+NOR+s+Nm28Pht9R2Qu8D7i//7F/H1hy\n7HlmeNfaVd1e0uNy7LkfHTffHsA7+j+OOaa+YM8zP/vLBcv7vN0PXL0QzzOw7phAmJHz2n3fHan/\n3LjjPRbTJaPp/jzGmjmay4zqt8jvAR4BVlbVi931ErCy28db/7n2c/kj4HeBf+ztdwKvVtUbvT06\n/5+urftf6/Hn0prXA1PAn/Vlsi8keTsL+DxX1UHgvwH/F3iR4Xnbx8I+z0fN1Hld0+1j6ye0mAJh\nQUryi8BfAp+oqtdH+2r4q8GCua84yW8Dh6tq31zPZRYtYXhZ4faqeg/w9wwvJfzUAjzPyxj+ccv1\nwGrg7cDmOZ3UHJiL87qYAmHB/XmMJG9hGAZfqaqvd/nlJKu6fxVwuOvHW/+59HN5L/DBJH8L3M3w\nstHngKVJjn7JcnT+P11b978D+AHn1poPAAeq6pHevpdhQCzk8/wvgO9V1VRV/QPwdYbnfiGf56Nm\n6rwe7Pax9RNaTIGwoP48Rt8xcCfwdFX94UjXLuDonQbbGH62cLS+te9W2Ai81m9NdwObkizr38w2\ndW3eqapbq2ptVa1jeP6+WVW/AzwEfKiHHbvmoz+LD/X46vr1fXfKemADww/g5p2qegl4Icm7u/R+\nhn8WfsGeZ4aXijYm+Sf97/zomhfseR4xI+e1+15PsrF/hltHjnV8c/2hyix/gHMtw7txngV+b67n\nc4Zr+XWGbycfA77Tj2sZXjvdC+wH/hewvMeH4f8J0bPA48Bg5Fj/Gpjsx41zvbYx1/+b/Owuo3cx\n/B/6JPAXwFu7/rbenuz+d43s/3v9s3iGMe6+mOO1XgpM9Ln+HwzvJlnQ5xn4feC7wBPAlxneKbSg\nzjPwVYafkfwDw3eCN83keQUG/fN7FvjvHHNjwnQP/3SFJAlYXJeMJEknYCBIkgADQZLUDARJEmAg\nSJKagSBJAgwESVL7/27i0eAlIZFDAAAAAElFTkSuQmCC\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        },
        {
          "output_type": "stream",
          "text": [
            "3.57512896650546 3.0 2.605938655784157\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZAAAAD4CAYAAADCb7BPAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAASW0lEQVR4nO3df6zddX3H8edrVBTZpAVuCG3J2sVG\n05mpeIM1mmWTDQozlj+MwZjRmMb+IW46l2xlS0am/2iyyCRREiIoJEZk6EZj1K6rJMuWgNyqkx+V\nceeP0RbolfJjm4kOfe+P86k7XO+90M8t55zbPh/Jyfl+39/P93ze9x701e+Pc26qCkmSjtevjLsB\nSdLKZIBIkroYIJKkLgaIJKmLASJJ6rJq3A2MyrnnnlsbNmwYdxuStKLs37//R1U1tdC2UyZANmzY\nwMzMzLjbkKQVJckPF9vmKSxJUhcDRJLUxQCRJHUxQCRJXQwQSVIXA0SS1MUAkSR1MUAkSV0MEElS\nl1Pmk+jLsXbtwvXDh0fbhyRNEo9AJEldDBBJUhcDRJLUxQCRJHUxQCRJXQwQSVIXA0SS1MUAkSR1\nMUAkSV0MEElSFwNEktTFAJEkdTFAJEldDBBJUpfnDZAkNyc5kuT+odrZSfYmebg9r2n1JLk+yWyS\n7yS5cGif7W38w0m2D9XfkOS+ts/1SdI7hyRpdF7IEchnga3zaruAfVW1CdjX1gEuAza1x07gBhiE\nAXAt8EbgIuDaY4HQxrx3aL+tPXNIkkbreQOkqv4ZODqvvA24pS3fAlwxVL+1Bu4GVic5H7gU2FtV\nR6vqSWAvsLVte0VV3V1VBdw677WOZw5J0gj1XgM5r6oebcuPAee15XXAI0PjDrbaUvWDC9R75vgl\nSXYmmUkyMzc39wJ/NEnSC7Hsi+jtyKFOQC8nfI6qurGqpqtqempq6kXoTJJOXb0B8vix00bt+Uir\nHwIuGBq3vtWWqq9foN4zhyRphHoDZDdw7E6q7cCdQ/Wr2p1SW4Cn22moPcAlSda0i+eXAHvatmeS\nbGl3X10177WOZw5J0giter4BST4P/A5wbpKDDO6m+ihwe5IdwA+Bd7bhXwEuB2aBHwPvAaiqo0k+\nAtzbxn24qo5dmH8fgzu9zgC+2h4c7xySpNHK4PLCyW96erpmZma69l27duH64cPLaEiSVoAk+6tq\neqFtfhJdktTFAJEkdTFAJEldDBBJUhcDRJLUxQCRJHUxQCRJXQwQSVIXA0SS1MUAkSR1MUAkSV0M\nEElSFwNEktTFAJEkdTFAJEldDBBJUhcDRJLUxQCRJHUxQCRJXQwQSVIXA0SS1MUAkSR1MUAkSV0M\nEElSFwNEktTFAJEkdTFAJEldDBBJUhcDRJLUZVkBkuRPkjyQ5P4kn0/ysiQbk9yTZDbJF5Kc3sa+\ntK3Ptu0bhl7nmlZ/KMmlQ/WtrTabZNdQfcE5JEmj0x0gSdYBfwxMV9VrgNOAK4GPAddV1SuBJ4Ed\nbZcdwJOtfl0bR5LNbb/fBLYCn0pyWpLTgE8ClwGbgXe1sSwxhyRpRJZ7CmsVcEaSVcDLgUeBtwJ3\ntO23AFe05W1tnbb94iRp9duq6idV9X1gFrioPWar6ntV9VPgNmBb22exOSRJI9IdIFV1CPgb4D8Z\nBMfTwH7gqap6tg07CKxry+uAR9q+z7bx5wzX5+2zWP2cJeZ4jiQ7k8wkmZmbm+v9USVJC1jOKaw1\nDI4eNgJrgTMZnIKaGFV1Y1VNV9X01NTUuNuRpJPKck5h/R7w/aqaq6r/Bb4EvBlY3U5pAawHDrXl\nQ8AFAG37WcATw/V5+yxWf2KJOSRJI7KcAPlPYEuSl7frEhcDDwJ3Ae9oY7YDd7bl3W2dtv3rVVWt\nfmW7S2sjsAn4BnAvsKndcXU6
gwvtu9s+i80hSRqR5VwDuYfBhexvAve117oR+HPgQ0lmGVyvuKnt\nchNwTqt/CNjVXucB4HYG4fM14Oqq+lm7xvF+YA9wALi9jWWJOSRJI5LBP+hPftPT0zUzM9O179q1\nC9cPH15GQ5K0AiTZX1XTC23zk+iSpC4GiCSpiwEiSepigEiSuhggkqQuBogkqYsBIknqYoBIkroY\nIJKkLgaIJKmLASJJ6mKASJK6GCCSpC4GiCSpiwEiSepigEiSuhggkqQuBogkqYsBIknqYoBIkroY\nIJKkLgaIJKmLASJJ6mKASJK6GCCSpC4GiCSpiwEiSepigEiSuiwrQJKsTnJHku8mOZDkTUnOTrI3\nycPteU0bmyTXJ5lN8p0kFw69zvY2/uEk24fqb0hyX9vn+iRp9QXnkCSNznKPQD4BfK2qXg28FjgA\n7AL2VdUmYF9bB7gM2NQeO4EbYBAGwLXAG4GLgGuHAuEG4L1D+21t9cXmkCSNSHeAJDkL+G3gJoCq\n+mlVPQVsA25pw24BrmjL24Bba+BuYHWS84FLgb1VdbSqngT2AlvbtldU1d1VVcCt815roTkkSSOy\nnCOQjcAc8Jkk30ry6SRnAudV1aNtzGPAeW15HfDI0P4HW22p+sEF6iwxhyRpRJYTIKuAC4Ebqur1\nwP8w71RSO3KoZczxvJaaI8nOJDNJZubm5l7MNiTplLOcADkIHKyqe9r6HQwC5fF2+on2fKRtPwRc\nMLT/+lZbqr5+gTpLzPEcVXVjVU1X1fTU1FTXDylJWlh3gFTVY8AjSV7VShcDDwK7gWN3Um0H7mzL\nu4Gr2t1YW4Cn22moPcAlSda0i+eXAHvatmeSbGl3X10177UWmkOSNCKrlrn/HwGfS3I68D3gPQxC\n6fYkO4AfAu9sY78CXA7MAj9uY6mqo0k+Atzbxn24qo625fcBnwXOAL7aHgAfXWSOibB27eLbDh8e\nXR+S9GLK4BLCyW96erpmZma69l0sEBYLAwNE0skiyf6qml5om59ElyR1MUAkSV0MEElSFwNEktTF\nAJEkdTFAJEldDBBJUhcDRJLUxQCRJHUxQCRJXQwQSVIXA0SS1MUAkSR1MUAkSV0MEElSFwNEktTF\nAJEkdTFAJEldDBBJUhcDRJLUxQCRJHUxQCRJXQwQSVIXA0SS1MUAkSR1MUAkSV0MEElSFwNEktTF\nAJEkdVl2gCQ5Lcm3kny5rW9Mck+S2SRfSHJ6q7+0rc+27RuGXuOaVn8oyaVD9a2tNptk11B9wTkk\nSaNzIo5APgAcGFr/GHBdVb0SeBLY0eo7gCdb/bo2jiSbgSuB3wS2Ap9qoXQa8EngMmAz8K42dqk5\nJEkjsqwASbIe+APg0209wFuBO9qQW4Ar2vK2tk7bfnEbvw24rap+UlXfB2aBi9pjtqq+V1U/BW4D\ntj3PHJKkEVnuEcjfAn8G/LytnwM8VVXPtvWDwLq2vA54BKBtf7qN/0V93j6L1Zea4zmS7Ewyk2Rm\nbm6u92eUJC2gO0CSvA04UlX7T2A/J1RV3VhV01U1PTU1Ne52JOmksmoZ+74ZeHuSy4GXAa8APgGs\nTrKqHSGsBw618YeAC4CDSVYBZwFPDNWPGd5nofoTS8whSRqR7iOQqrqmqtZX1QYGF8G/XlXvBu4C\n3tGGbQfubMu72zpt+9erqlr9ynaX1kZgE/AN4F5gU7vj6vQ2x+62z2JzSJJG5MX4HMifAx9KMsvg\nesVNrX4TcE6rfwjYBVBVDwC3Aw8CXwOurqqftaOL9wN7GNzldXsbu9QckqQRyeAf9Ce/6enpmpmZ\n6dp37dqF64cPH9/4pfaRpEmUZH9VTS+0zU+iS5K6GCCSpC4GiCSpiwEiSepigEiSuhggkqQuBogk\nqYsBIknqYoBIkros58sUT3lLfeJckk52HoFIkroYIJKkLgaIJKmLASJJ6mKASJK6GCCSpC4GiC
Sp\niwEiSepigEiSuhggkqQuBogkqYsBIknqYoBIkroYIJKkLgaIJKmLfw9kQiz2t0UOHx5tH5L0QnkE\nIknqYoBIkroYIJKkLt0BkuSCJHcleTDJA0k+0OpnJ9mb5OH2vKbVk+T6JLNJvpPkwqHX2t7GP5xk\n+1D9DUnua/tcnyRLzSFJGp3lHIE8C/xpVW0GtgBXJ9kM7AL2VdUmYF9bB7gM2NQeO4EbYBAGwLXA\nG4GLgGuHAuEG4L1D+21t9cXmkCSNSHeAVNWjVfXNtvxfwAFgHbANuKUNuwW4oi1vA26tgbuB1UnO\nBy4F9lbV0ap6EtgLbG3bXlFVd1dVAbfOe62F5pAkjcgJuQaSZAPweuAe4LyqerRtegw4ry2vAx4Z\n2u1gqy1VP7hAnSXmmN/XziQzSWbm5uaO/weTJC1q2QGS5FeBLwIfrKpnhre1I4da7hxLWWqOqrqx\nqqaranpqaurFbEOSTjnLCpAkL2EQHp+rqi+18uPt9BPt+UirHwIuGNp9fastVV+/QH2pOSRJI7Kc\nu7AC3AQcqKqPD23aDRy7k2o7cOdQ/ap2N9YW4Ol2GmoPcEmSNe3i+SXAnrbtmSRb2lxXzXutheaQ\nJI3Icr7K5M3AHwL3Jfl2q/0F8FHg9iQ7gB8C72zbvgJcDswCPwbeA1BVR5N8BLi3jftwVR1ty+8D\nPgucAXy1PVhiDknSiHQHSFX9C5BFNl+8wPgCrl7ktW4Gbl6gPgO8ZoH6EwvNIUkaHT+JLknqYoBI\nkroYIJKkLv49kBFb7O9+SNJK4xGIJKmLASJJ6mKASJK6GCCSpC4GiCSpiwEiSepigEiSuhggkqQu\nBogkqYsBIknqYoBIkroYIJKkLgaIJKmLASJJ6mKASJK6GCCSpC7+QakJd7x/gOrw4RenD0mazyMQ\nSVIXA0SS1MUAkSR1MUAkSV0MEElSFwNEktTFAJEkdTFAJEldVmyAJNma5KEks0l2jbsfSTrVrMhP\noic5Dfgk8PvAQeDeJLur6sHxdjZ+i31y3U+oSzrRVmSAABcBs1X1PYAktwHbgFM+QBZzvF+JshiD\nSNIxKzVA1gGPDK0fBN44f1CSncDOtvrfSR56ga9/LvCjZXU4GiPvM+nabSX8PldCj2CfJ9JK6BHG\n3+evL7ZhpQbIC1JVNwI3Hu9+SWaqavpFaOmEss8TZyX0CPZ5Iq2EHmGy+1ypF9EPARcMra9vNUnS\niKzUALkX2JRkY5LTgSuB3WPuSZJOKSvyFFZVPZvk/cAe4DTg5qp64AROcdynvcbEPk+cldAj2OeJ\ntBJ6hAnuM1U17h4kSSvQSj2FJUkaMwNEktTFAJlnUr8iJcnNSY4kuX+odnaSvUkebs9rxtzjBUnu\nSvJgkgeSfGBC+3xZkm8k+bfW51+3+sYk97T3/gvtBo2xSnJakm8l+fIE9/iDJPcl+XaSmVabqPe8\n9bQ6yR1JvpvkQJI3TVKfSV7VfofHHs8k+eAk9TifATJk6CtSLgM2A+9Ksnm8Xf3CZ4Gt82q7gH1V\ntQnY19bH6VngT6tqM7AFuLr9/iatz58Ab62q1wKvA7Ym2QJ8DLiuql4JPAnsGGOPx3wAODC0Pok9\nAvxuVb1u6PMKk/aeA3wC+FpVvRp4LYPf68T0WVUPtd/h64A3AD8G/n6SevwlVeWjPYA3AXuG1q8B\nrhl3X0P9bADuH1p/CDi/LZ8PPDTuHuf1eyeD7yub2D6BlwPfZPBNBj8CVi3038KYelvP4P8w3gp8\nGcik9dj6+AFw7rzaRL3nwFnA92k3Dk1qn0N9XQL86yT3WFUegcyz0FekrBtTLy/EeVX1aFt+DDhv\nnM0MS7IBeD1wDxPYZzs19G3gCLAX+A/gqap6tg2ZhPf+b4E/A37e1s9h8noEKOAfk+xvXx8Ek/ee\nbwTmgM+0U4KfTnImk9fnMVcCn2/Lk9qjAXKyqME/Tybinu
wkvwp8EfhgVT0zvG1S+qyqn9XgVMF6\nBl/O+eoxt/QcSd4GHKmq/ePu5QV4S1VdyODU79VJfnt444S856uAC4Ebqur1wP8w71TQhPRJu671\nduDv5m+blB6PMUCea6V9RcrjSc4HaM9HxtwPSV7CIDw+V1VfauWJ6/OYqnoKuIvB6aDVSY59uHbc\n7/2bgbcn+QFwG4PTWJ9gsnoEoKoOtecjDM7ZX8TkvecHgYNVdU9bv4NBoExanzAI4m9W1eNtfRJ7\nBAyQ+VbaV6TsBra35e0MrjmMTZIANwEHqurjQ5smrc+pJKvb8hkMrtMcYBAk72jDxtpnVV1TVeur\nagOD/w6/XlXvZoJ6BEhyZpJfO7bM4Nz9/UzYe15VjwGPJHlVK13M4M8/TFSfzbv4/9NXMJk9Doz7\nIsykPYDLgX9ncE78L8fdz1BfnwceBf6Xwb+mdjA4J74PeBj4J+DsMff4FgaH198Bvt0el09gn78F\nfKv1eT/wV63+G8A3gFkGpw9eOu73vfX1O8CXJ7HH1s+/tccDx/43M2nveevpdcBMe9//AVgzaX0C\nZwJPAGcN1Saqx+GHX2UiSeriKSxJUhcDRJLUxQCRJHUxQCRJXQwQSVIXA0SS1MUAkSR1+T+VmtYx\nRwH1zAAAAABJRU5ErkJggg==\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "SoxdyJtzB94A",
        "colab_type": "text"
      },
      "source": [
        "As you can see, some HTML elements are left over: < and > occur quite often in your comments dataset, which may make it harder for your model to learn to generate comments that contain those elements. Luckily, it won't really affect your model's accuracy, but exploring your data like this lets you see how the data may be influencing your model.\n",
        "\n",
        "**TODO For You:** Perform some further cleaning steps to remove HTML and any other cleaning you deem necessary and see how your performance changes."
      ]
    },
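    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "As a starting point for the cleaning TODO, here is a minimal sketch of one way to strip HTML from a comment using only the Python standard library. The names `strip_html` and `TAG_RE` are hypothetical helpers, not part of this post's pipeline, and the regex assumes simple, well-formed tags like those typically found in Javadoc. Tags are removed before entities are unescaped so that a literal `&lt;` in a comment survives as `<`.\n",
        "\n",
        "```python\n",
        "import html\n",
        "import re\n",
        "\n",
        "# Assumption: tags are simple and well formed, e.g. <p>, </code>, <a href=\"...\">\n",
        "TAG_RE = re.compile(r\"<[^>]+>\")\n",
        "\n",
        "def strip_html(comment):\n",
        "    \"Remove HTML tags, unescape entities, and collapse whitespace.\"\n",
        "    no_tags = TAG_RE.sub(\" \", comment)    # replace each tag with a space\n",
        "    unescaped = html.unescape(no_tags)    # e.g. &lt; becomes <\n",
        "    return \" \".join(unescaped.split())    # collapse runs of whitespace\n",
        "\n",
        "print(strip_html(\"Returns <code>true</code> if x &lt; y.\"))\n",
        "```\n",
        "\n",
        "You could apply a function like this to the `docstring` column of each DataFrame, rerun the rest of the post, and compare your results against the uncleaned data."
      ]
    },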
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "h_E41kXU0Adu",
        "colab_type": "text"
      },
      "source": [
        "## Loading the data using FastAI\n",
        "Now that you have the data processed and cleaned, you need a way to get it into the format that FastAI uses. To do that, you will use some code from Rachel Thomas' awesome [course on NLP](https://github.com/fastai/course-nlp/blob/master/7-seq2seq-translation.ipynb), which lets you create a Sequence to Sequence DataBunch. It is sequence to sequence because you are going from the sequence of code tokens to the sequence of tokens in the code's docstring, and a DataBunch is just the format FastAI uses to manage loading the data into memory for training and evaluation."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "nHK-R9ABP8Mk",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#collapse_show\n",
        "def seq2seq_collate(samples, pad_idx=1, pad_first=True, backwards=False):\n",
        "    \"Function that collects samples and adds padding. Flips token order if needed.\"\n",
        "    samples = to_data(samples)\n",
        "    max_len_x,max_len_y = max([len(s[0]) for s in samples]),max([len(s[1]) for s in samples])\n",
        "    res_x = torch.zeros(len(samples), max_len_x).long() + pad_idx\n",
        "    res_y = torch.zeros(len(samples), max_len_y).long() + pad_idx\n",
        "    if backwards: pad_first = not pad_first\n",
        "    for i,s in enumerate(samples):\n",
        "        if pad_first: \n",
        "            res_x[i,-len(s[0]):],res_y[i,-len(s[1]):] = LongTensor(s[0]),LongTensor(s[1])\n",
        "        else:         \n",
        "            res_x[i, :len(s[0])],res_y[i, :len(s[1])] = LongTensor(s[0]),LongTensor(s[1])\n",
        "    if backwards: res_x,res_y = res_x.flip(1),res_y.flip(1)\n",
        "    return res_x, res_y\n",
        "\n",
        "class Seq2SeqDataBunch(TextDataBunch):\n",
        "    \"Create a `TextDataBunch` suitable for training a sequence-to-sequence model.\"\n",
        "    @classmethod\n",
        "    def create(cls, train_ds, valid_ds, test_ds=None, path='.', bs=32, val_bs=None, pad_idx=1,\n",
        "               dl_tfms=None, pad_first=False, device=None, no_check=False, backwards=False, **dl_kwargs):\n",
        "        \"Function that transforms the `datasets` into a `DataBunch` for sequence-to-sequence training. Passes `**dl_kwargs` on to `DataLoader()`.\"\n",
        "        datasets = cls._init_ds(train_ds, valid_ds, test_ds)\n",
        "        val_bs = ifnone(val_bs, bs)\n",
        "        collate_fn = partial(seq2seq_collate, pad_idx=pad_idx, pad_first=pad_first, backwards=backwards)\n",
        "        train_sampler = SortishSampler(datasets[0].x, key=lambda t: len(datasets[0][t][0].data), bs=bs//2)\n",
        "        train_dl = DataLoader(datasets[0], batch_size=bs, sampler=train_sampler, drop_last=True, **dl_kwargs)\n",
        "        dataloaders = [train_dl]\n",
        "        for ds in datasets[1:]:\n",
        "            lengths = [len(t) for t in ds.x.items]\n",
        "            sampler = SortSampler(ds.x, key=lengths.__getitem__)\n",
        "            dataloaders.append(DataLoader(ds, batch_size=val_bs, sampler=sampler, **dl_kwargs))\n",
        "        return cls(*dataloaders, path=path, device=device, collate_fn=collate_fn, no_check=no_check)\n",
        "\n",
        "class Seq2SeqTextList(TextList):\n",
        "    _bunch = Seq2SeqDataBunch\n",
        "    _label_cls = TextList"
      ],
      "execution_count": 0,
      "outputs": []
    },
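    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "To make the padding logic in `seq2seq_collate` concrete, here is a minimal sketch of the same idea using plain Python lists instead of tensors. The helper `pad_batch` is hypothetical and only illustrates the `pad_idx` and `pad_first` arguments; it does not cover the `backwards` flip or the separate handling of inputs and targets.\n",
        "\n",
        "```python\n",
        "def pad_batch(seqs, pad_idx=1, pad_first=False):\n",
        "    \"Pad every sequence in `seqs` to the length of the longest one.\"\n",
        "    max_len = max(len(s) for s in seqs)\n",
        "    padded = []\n",
        "    for s in seqs:\n",
        "        padding = [pad_idx] * (max_len - len(s))\n",
        "        # pad_first puts the padding in front of the tokens, otherwise after them\n",
        "        padded.append(padding + list(s) if pad_first else list(s) + padding)\n",
        "    return padded\n",
        "\n",
        "print(pad_batch([[5, 6, 7], [8, 9]]))                  # [[5, 6, 7], [8, 9, 1]]\n",
        "print(pad_batch([[5, 6, 7], [8, 9]], pad_first=True))  # [[5, 6, 7], [1, 8, 9]]\n",
        "```\n",
        "\n",
        "Every sequence in a batch ends up with the length of the longest one, which is what allows the batch to be stacked into a single tensor."
      ]
    },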
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VVuhlIjq4ghe",
        "colab_type": "text"
      },
      "source": [
        "Here is where you tell FastAI to use your trained BPE models for tokenizing your data. FastAI's tokenizers also do some additional processing of your text, such as lowercasing all words and removing repetitions. You can find a full list of the processing FastAI uses [here](https://docs.fast.ai/text.transform.html#Tokenizer)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "XnfC0fa8YibA",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "method_processor = SPProcessor(\n",
        "    sp_model = path / (method_tokenizer + \".model\"),\n",
        "    sp_vocab = path / (method_tokenizer + \".vocab\"),\n",
        "    include_eos = True)\n",
        "comment_processor = SPProcessor(\n",
        "    sp_model = path / (comment_tokenizer + \".model\"),\n",
        "    sp_vocab = path / (comment_tokenizer + \".vocab\"),\n",
        "    include_eos = True)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "C7QFCQkN0PL6",
        "colab_type": "text"
      },
      "source": [
        "Now that you have your BPE models, you will generate the DataBunches suitable for your task, which will be Seq2Seq DataBunches. You will also filter out sequences that are too long so that everything fits onto a Google Colab GPU and training does not take too long."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "PbKA_-EP4530",
        "colab_type": "code",
        "outputId": "1ed2031b-e146-44da-df5f-9acf3f4106f6",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 374
        }
      },
      "source": [
        "#collapse_show\n",
        "def gen_dbs(df_trn, df_val, df_tst, method_processor, comment_processor, bs = 96, max_seq = 128):\n",
        "    is_valid = [False] * len(df_trn) + [True] * len(df_val)\n",
        "    df_merged = pd.concat([df_trn, df_val])\n",
        "    df_merged = pd.DataFrame(zip(df_merged[\"code\"].to_list(), df_merged[\"docstring\"].to_list(), is_valid),\n",
        "                                columns = [\"code\", \"docstring\", \"valid\"]\n",
        "    )\n",
        "                             \n",
        "    db_trn = (Seq2SeqTextList\n",
        "              .from_df(df_merged, path = path, cols='code', processor = method_processor)\n",
        "              .split_from_df(col='valid')\n",
        "              .label_from_df(cols='docstring', label_cls=TextList, processor = comment_processor)\n",
        "              .filter_by_func(lambda x, y: len(x) > max_seq or len(y) > max_seq)\n",
        "              .databunch(bs = bs))\n",
        "    \n",
        "    db_tst = (Seq2SeqTextList\n",
        "              .from_df(df_tst, path = path, cols='code', processor = method_processor)\n",
        "              .split_by_rand_pct(valid_pct = 0.01)\n",
        "              .label_from_df(cols='docstring', label_cls=TextList, processor = comment_processor)\n",
        "              .filter_by_func(lambda x, y: len(x) > max_seq or len(y) > max_seq)\n",
        "              .databunch(bs = 16))\n",
        "    \n",
        "    return db_trn, db_tst\n",
        "\n",
        "db_trn, db_tst = gen_dbs(df_trn, df_val, df_tst, method_processor, comment_processor, bs = 96, max_seq = 128)\n",
        "db_trn.show_batch()"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "text/html": [
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th>text</th>\n",
              "      <th>target</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <td>▁ xx b os ▁boolean ▁res er ve ( int ▁column , ▁int ▁size ) ▁{ ▁if ▁( ( column ▁&lt; ▁0) ▁|| ▁( ( column ▁+ ▁size ) ▁&gt; ▁columns )) ▁throw ▁new ▁index out of bound s exception (\" res er ve ▁- ▁ inc or rec t ▁column ▁/ ▁size \"); ▁for ( int ▁i = column ; ▁i ▁&lt; ▁column ▁+ ▁size ; ▁i ++) ▁{</td>\n",
              "      <td>▁ xx b os ▁ xx ma j ▁re s er ve s ▁a ▁&lt; code &gt; ce ll &lt; ▁/ ▁ xx up ▁code &gt; ▁in ▁the ▁&lt; code &gt; ro w &lt; ▁/ ▁ xx up ▁code &gt; . ▁ xx e os</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>▁ xx b os ▁@ help ( ▁ help ▁= ▁\" get ▁all ▁the ▁ virtual network function descriptor ▁ xx m a j ▁dependency ▁of ▁a ▁network service descriptor ▁with ▁specific ▁id \" ▁) ▁public ▁list &lt; v n f d ep end en cy &gt; ▁get v n f dependencies ( final ▁ xx m a j ▁string ▁id ns d ) ▁throws ▁s d k exception ▁{</td>\n",
              "      <td>▁ xx b os ▁ xx ma j ▁return ▁a ▁ xx ma j ▁list ▁with ▁all ▁the ▁v n f de p end en c ies ▁that ▁are ▁contain ed ▁in ▁a ▁specific ▁network service descriptor . ▁ xx e os</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>▁ xx b os ▁@ override ▁public ▁void ▁delete as set and attachment s ( final ▁ xx m a j ▁string ▁as set id ) ▁throws ▁ io exception , ▁request failure exception ▁{ ▁ xx m a j ▁as set ▁as s ▁= ▁get un v er ified as set ( as set id ); ▁list &lt; attachment &gt; ▁attachment s ▁= ▁as s . get attachment s</td>\n",
              "      <td>▁ xx b os ▁ xx ma j ▁this ▁will ▁delete ▁an ▁asset ▁and ▁all ▁its ▁attachments ▁ xx e os</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>▁ xx b os ▁public ▁list &lt; character book mark fold er s response &gt; ▁get character s character id book mark s fold er s ( integer ▁character id , ▁ xx m a j ▁string ▁data source , ▁ xx m a j ▁string ▁if n one match , ▁ xx m a j ▁ integer ▁page , ▁ xx m a j ▁string ▁token ) ▁throws ▁api</td>\n",
              "      <td>▁ xx b os ▁ xx ma j ▁list ▁ bookmark ▁folders ▁a ▁list ▁of ▁your ▁character &amp; ' s ▁personal ▁ bookmark ▁folders ▁--- ▁ xx ma j ▁this ▁route ▁is ▁cached ▁for ▁up ▁to ▁36 00 ▁seconds ▁ xx up ▁ s so ▁ xx ma j ▁scope : ▁ esi - bookmark s . read _ character _ bookmark s . v 1 ▁ xx e os</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>▁ xx b os ▁@ de pre c ated ▁protected ▁final ▁map &lt; db id , ▁k n n list &gt; ▁batch n n ( n ▁node , ▁db id s ▁ids , ▁int ▁k max ) ▁{ ▁map &lt; db id , ▁k n n list &gt; ▁res ▁= ▁new ▁hash map &lt;&gt;( id s . size ()); ▁for ( db id iter ▁iter ▁= ▁ids . iter ();</td>\n",
              "      <td>▁ xx b os ▁ xx ma j ▁perform s ▁a ▁batch ▁k - ne a rest ▁neighbor ▁query ▁for ▁a ▁list ▁of ▁query ▁objects . ▁ xx e os</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>"
            ],
            "text/plain": [
              "<IPython.core.display.HTML object>"
            ]
          },
          "metadata": {
            "tags": []
          }
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "QagTGWHuvnH4",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#hide\n",
        "db_trn.save(\"method_commenter_db_trn.pkl\")\n",
        "db_tst.save(\"method_commenter_db_tst.pkl\")"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "AmL6GmYNQEHz",
        "colab_type": "code",
        "colab": {}
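      },
      "source": [
        "#collapse_show\n",
        "def shift_tfm(b):\n",
        "    \"Shift the target by one position so the decoder learns to predict the next token.\"\n",
        "    x,y = b\n",
        "    # Prepend index 1 (the same value as the model's default pad_idx) to the target\n",
        "    y = F.pad(y, (1, 0), value=1)\n",
        "    # Inputs: (source, target minus its last token); labels: the original target\n",
        "    return [x,y[:,:-1]], y[:,1:]\n",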
        "\n",
        "# Add the necessary shift transformation for training your Transformer model\n",
        "db_trn.add_tfm(shift_tfm)\n",
        "db_tst.add_tfm(shift_tfm)"
      ],
      "execution_count": 0,
      "outputs": []
    },
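    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "To make the shift concrete, here is a small worked illustration of what `shift_tfm` does to a single target sequence. It is only a sketch: it assumes index 1 is the padding token (the `value=1` above, matching the `pad_idx=1` defaults used later), and the symbols $y_1, \\dots, y_n$ are illustrative token ids.\n",
        "\n",
        "$$\n",
        "\\begin{aligned}\n",
        "\\text{original target} &= (y_1, y_2, \\dots, y_n) \\\\\n",
        "\\text{decoder input} &= (1, y_1, \\dots, y_{n-1}) \\\\\n",
        "\\text{label} &= (y_1, y_2, \\dots, y_n)\n",
        "\\end{aligned}\n",
        "$$\n",
        "\n",
        "So at position $t$ the decoder is fed the tokens that come before $y_t$ and is trained to predict $y_t$ itself, which is the usual teacher-forcing setup for sequence-to-sequence models."
      ]
    },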
    {
      "cell_type": "code",
      "metadata": {
        "id": "WXvycFsk5Gkb",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#hide\n",
        "# Run if you already have generated and saved some data\n",
        "# db_trn = load_data(path, 'method_commenter_db_trn.pkl', bs = 96)\n",
        "# db_tst = load_data(path, 'method_commenter_db_tst.pkl', bs = 16)\n",
        "# db_trn.add_tfm(shift_tfm)\n",
        "# db_tst.add_tfm(shift_tfm)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "W6wb-6uL0V1D",
        "colab_type": "text"
      },
      "source": [
        "# Defining your model\n",
        "\n",
        "In this example, you will be using the Transformer architecture developed by [Vaswani et al.](https://arxiv.org/abs/1706.03762). If you want a better understanding of this model, I highly recommend [The Annotated Transformer](http://nlp.seas.harvard.edu/2018/04/03/attention.html#applications-of-attention-in-our-model) blog post and the [NLP course](https://www.youtube.com/playlist?list=PLtmWHNX-gukKocXQOkQjuVxglSDYWsSh9) by Rachel Thomas, from which this model code is copied."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "_-7BtjIhcIyn",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#collapse\n",
        "class PositionalEncoding(nn.Module):\n",
        "    \"Encode the position with a sinusoid.\"\n",
        "    def __init__(self, d):\n",
        "        super().__init__()\n",
        "        self.register_buffer('freq', 1 / (10000 ** (torch.arange(0., d, 2.)/d)))\n",
        "    \n",
        "    def forward(self, pos):\n",
        "        inp = torch.ger(pos, self.freq)\n",
        "        enc = torch.cat([inp.sin(), inp.cos()], dim=-1)\n",
        "        return enc\n",
        "\n",
        "class TransformerEmbedding(nn.Module):\n",
        "    \"Embedding + positional encoding + dropout\"\n",
        "    def __init__(self, vocab_sz, emb_sz, inp_p=0.):\n",
        "        super().__init__()\n",
        "        self.emb_sz = emb_sz\n",
        "        self.embed = embedding(vocab_sz, emb_sz)\n",
        "        self.pos_enc = PositionalEncoding(emb_sz)\n",
        "        self.drop = nn.Dropout(inp_p)\n",
        "    \n",
        "    def forward(self, inp): \n",
        "        pos = torch.arange(0, inp.size(1), device=inp.device).float()\n",
        "        return self.drop(self.embed(inp) * math.sqrt(self.emb_sz) + self.pos_enc(pos))\n",
        "\n",
        "def feed_forward(d_model, d_ff, ff_p=0., double_drop=True):\n",
        "    layers = [nn.Linear(d_model, d_ff), nn.ReLU()]\n",
        "    if double_drop: layers.append(nn.Dropout(ff_p))\n",
        "    return SequentialEx(*layers, nn.Linear(d_ff, d_model), nn.Dropout(ff_p), MergeLayer(), nn.LayerNorm(d_model))\n",
        "\n",
        "class MultiHeadAttention(nn.Module):\n",
        "    def __init__(self, n_heads, d_model, d_head=None, p=0., bias=True, scale=True):\n",
        "        super().__init__()\n",
        "        d_head = ifnone(d_head, d_model//n_heads)\n",
        "        self.n_heads,self.d_head,self.scale = n_heads,d_head,scale\n",
        "        self.q_wgt,self.k_wgt,self.v_wgt = [nn.Linear(\n",
        "            d_model, n_heads * d_head, bias=bias) for o in range(3)]\n",
        "        self.out = nn.Linear(n_heads * d_head, d_model, bias=bias)\n",
        "        self.drop_att,self.drop_res = nn.Dropout(p),nn.Dropout(p)\n",
        "        self.ln = nn.LayerNorm(d_model)\n",
        "        \n",
        "    def forward(self, q, kv, mask=None):\n",
        "        return self.ln(q + self.drop_res(self.out(self._apply_attention(q, kv, mask=mask))))\n",
        "    \n",
        "    def create_attn_mat(self, x, layer, bs):\n",
        "        return layer(x).view(bs, x.size(1), self.n_heads, self.d_head\n",
        "                            ).permute(0, 2, 1, 3)\n",
        "    \n",
        "    def _apply_attention(self, q, kv, mask=None):\n",
        "        bs,seq_len = q.size(0),q.size(1)\n",
        "        wq,wk,wv = map(lambda o: self.create_attn_mat(*o,bs),\n",
        "                       zip((q,kv,kv),(self.q_wgt,self.k_wgt,self.v_wgt)))\n",
        "        attn_score = wq @ wk.transpose(2,3)\n",
        "        if self.scale: attn_score /= math.sqrt(self.d_head)\n",
        "        if mask is not None: \n",
        "            attn_score = attn_score.float().masked_fill(mask, -float('inf')).type_as(attn_score)\n",
        "        attn_prob = self.drop_att(F.softmax(attn_score, dim=-1))\n",
        "        attn_vec = attn_prob @ wv\n",
        "        return attn_vec.permute(0, 2, 1, 3).contiguous().view(bs, seq_len, -1)\n",
        "\n",
        "def get_output_mask(inp, pad_idx=1):\n",
        "    return torch.triu(inp.new_ones(inp.size(1),inp.size(1)), diagonal=1)[None,None].byte()\n",
        "\n",
        "class EncoderBlock(nn.Module):\n",
        "    \"Encoder block of a Transformer model.\"\n",
        "    #Can't use Sequential directly cause more than one input...\n",
        "    def __init__(self, n_heads, d_model, d_head, d_inner, p=0., bias=True, scale=True, double_drop=True):\n",
        "        super().__init__()\n",
        "        self.mha = MultiHeadAttention(n_heads, d_model, d_head, p=p, bias=bias, scale=scale)\n",
        "        self.ff  = feed_forward(d_model, d_inner, ff_p=p, double_drop=double_drop)\n",
        "    \n",
        "    def forward(self, x, mask=None): return self.ff(self.mha(x, x, mask=mask))\n",
        "\n",
        "class DecoderBlock(nn.Module):\n",
        "    \"Decoder block of a Transformer model.\"\n",
        "    #Can't use Sequential directly cause more than one input...\n",
        "    def __init__(self, n_heads, d_model, d_head, d_inner, p=0., bias=True, scale=True, double_drop=True):\n",
        "        super().__init__()\n",
        "        self.mha1 = MultiHeadAttention(n_heads, d_model, d_head, p=p, bias=bias, scale=scale)\n",
        "        self.mha2 = MultiHeadAttention(n_heads, d_model, d_head, p=p, bias=bias, scale=scale)\n",
        "        self.ff   = feed_forward(d_model, d_inner, ff_p=p, double_drop=double_drop)\n",
        "    \n",
        "    def forward(self, x, enc, mask_out=None): return self.ff(self.mha2(self.mha1(x, x, mask_out), enc))"
      ],
      "execution_count": 0,
      "outputs": []
    },
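    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "As a reading aid for the model code above, the two core computations can be written out as equations. This is a sketch of what `_apply_attention` and `PositionalEncoding` appear to compute, not a restatement of the original paper; the symbols $Q$, $K$, $V$, $M$ and $\\omega_k$ are notation introduced here for illustration.\n",
        "\n",
        "Each attention head computes\n",
        "\n",
        "$$\n",
        "\\text{Attention}(Q, K, V) = \\text{softmax}\\left(\\frac{Q K^\\top}{\\sqrt{d_{\\text{head}}}} + M\\right) V,\n",
        "\\qquad\n",
        "M_{ij} = \\begin{cases} -\\infty & \\text{if } j > i \\\\ 0 & \\text{otherwise} \\end{cases}\n",
        "$$\n",
        "\n",
        "where the mask $M$ (built by `get_output_mask`) is only passed to the decoder's self-attention, so that position $i$ cannot look at later positions. The encoder and the encoder-decoder attention are called without a mask in this code.\n",
        "\n",
        "The positional encoding for an embedding of size $d$ is\n",
        "\n",
        "$$\n",
        "\\omega_k = 10000^{-2k/d}, \\quad k = 0, \\dots, d/2 - 1\n",
        "$$\n",
        "\n",
        "$$\n",
        "\\text{PE}(pos) = \\big[\\sin(pos \\, \\omega_0), \\dots, \\sin(pos \\, \\omega_{d/2-1}), \\cos(pos \\, \\omega_0), \\dots, \\cos(pos \\, \\omega_{d/2-1})\\big]\n",
        "$$\n",
        "\n",
        "With the defaults of the `Transformer` class defined next (`d_head=32`, `d_model=256`), the attention scores are divided by $\\sqrt{32} \\approx 5.66$, and the token embeddings are multiplied by $\\sqrt{256} = 16$ before the positional encoding is added. Unlike the interleaved layout in the original paper, this implementation concatenates all sine terms followed by all cosine terms."
      ]
    },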
    {
      "cell_type": "code",
      "metadata": {
        "id": "Mck59t-2cY6x",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#collapse_show\n",
        "class Transformer(Module):\n",
        "    def __init__(self, inp_vsz, out_vsz, n_layers=6, n_heads=8, d_model=256, d_head=32, \n",
        "                 d_inner=1024, p=0.1, bias=True, scale=True, double_drop=True, pad_idx=1):\n",
        "        self.enc_emb = TransformerEmbedding(inp_vsz, d_model, p)\n",
        "        self.dec_emb = TransformerEmbedding(out_vsz, d_model, 0.)\n",
        "        args = (n_heads, d_model, d_head, d_inner, p, bias, scale, double_drop)\n",
        "        self.encoder = nn.ModuleList([EncoderBlock(*args) for _ in range(n_layers)])\n",
        "        self.decoder = nn.ModuleList([DecoderBlock(*args) for _ in range(n_layers)])\n",
        "        self.out = nn.Linear(d_model, out_vsz)\n",
        "        self.out.weight = self.dec_emb.embed.weight\n",
        "        self.pad_idx = pad_idx\n",
        "        \n",
        "    def forward(self, inp, out):\n",
        "        mask_out = get_output_mask(out, self.pad_idx)\n",
        "        enc,out = self.enc_emb(inp),self.dec_emb(out)\n",
        "        enc = compose(self.encoder)(enc)\n",
        "        out = compose(self.decoder)(out, enc, mask_out)\n",
        "        return self.out(out)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mfFuKpxH0a9a",
        "colab_type": "text"
      },
      "source": [
        "To evaluate your model, you will use the commonly used BLEU score, which measures how close your model's generated comment is to the real comment of a method. (This code is also copied from Rachel Thomas's NLP course.)"
      ]
    },
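    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "As a reading aid for the metric code in the next cell, the corpus-level score it computes can be written as follows. This is a sketch of what the `CorpusBLEU` callback does rather than a general definition of BLEU; the symbols $c_n$, $t_n$, $L_{\\text{pred}}$ and $L_{\\text{targ}}$ are names chosen here for the callback's `corrects`, `counts`, `pred_len` and `targ_len`, all accumulated over one epoch.\n",
        "\n",
        "$$\n",
        "p_n = \\frac{c_n}{t_n}, \\quad n = 1, \\dots, 4\n",
        "$$\n",
        "\n",
        "$$\n",
        "\\text{BP} = \\begin{cases} \\exp\\left(1 - L_{\\text{targ}} / L_{\\text{pred}}\\right) & \\text{if } L_{\\text{pred}} < L_{\\text{targ}} \\\\ 1 & \\text{otherwise} \\end{cases}\n",
        "$$\n",
        "\n",
        "$$\n",
        "\\text{BLEU} = \\text{BP} \\cdot \\left(p_1 \\, p_2 \\, p_3 \\, p_4\\right)^{1/4}\n",
        "$$\n",
        "\n",
        "For example, with n-gram precisions of 0.8, 0.6, 0.4 and 0.2 and no brevity penalty, the score is $(0.8 \\cdot 0.6 \\cdot 0.4 \\cdot 0.2)^{1/4} = 0.0384^{1/4} \\approx 0.44$. Because the score is a geometric mean, a single zero precision drives it to zero. Also note that predictions and targets within a batch appear to have the same padded length here, so the brevity penalty will likely stay at 1 in this setup."
      ]
    },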
    {
      "cell_type": "code",
      "metadata": {
        "id": "rYejBbsnccpi",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#collapse\n",
        "class NGram():\n",
        "    def __init__(self, ngram, max_n=5000): self.ngram,self.max_n = ngram,max_n\n",
        "    def __eq__(self, other):\n",
        "        if len(self.ngram) != len(other.ngram): return False\n",
        "        return np.all(np.array(self.ngram) == np.array(other.ngram))\n",
        "    def __hash__(self): return int(sum([o * self.max_n**i for i,o in enumerate(self.ngram)]))\n",
        "\n",
        "def get_grams(x, n, max_n=5000):\n",
        "    return x if n==1 else [NGram(x[i:i+n], max_n=max_n) for i in range(len(x)-n+1)]\n",
        "\n",
        "def get_correct_ngrams(pred, targ, n, max_n=5000):\n",
        "    pred_grams,targ_grams = get_grams(pred, n, max_n=max_n),get_grams(targ, n, max_n=max_n)\n",
        "    pred_cnt,targ_cnt = Counter(pred_grams),Counter(targ_grams)\n",
        "    return sum([min(c, targ_cnt[g]) for g,c in pred_cnt.items()]),len(pred_grams)\n",
        "\n",
        "class CorpusBLEU(Callback):\n",
        "    def __init__(self, vocab_sz):\n",
        "        self.vocab_sz = vocab_sz\n",
        "        self.name = 'bleu'\n",
        "    \n",
        "    def on_epoch_begin(self, **kwargs):\n",
        "        self.pred_len,self.targ_len,self.corrects,self.counts = 0,0,[0]*4,[0]*4\n",
        "    \n",
        "    def on_batch_end(self, last_output, last_target, **kwargs):\n",
        "        last_output = last_output.argmax(dim=-1)\n",
        "        for pred,targ in zip(last_output.cpu().numpy(),last_target.cpu().numpy()):\n",
        "            self.pred_len += len(pred)\n",
        "            self.targ_len += len(targ)\n",
        "            for i in range(4):\n",
        "                c,t = get_correct_ngrams(pred, targ, i+1, max_n=self.vocab_sz)\n",
        "                self.corrects[i] += c\n",
        "                self.counts[i]   += t\n",
        "    \n",
        "    def on_epoch_end(self, last_metrics, **kwargs):\n",
        "        precs = [c/t for c,t in zip(self.corrects,self.counts)]\n",
        "        len_penalty = exp(1 - self.targ_len/self.pred_len) if self.pred_len < self.targ_len else 1\n",
        "        bleu = len_penalty * ((precs[0]*precs[1]*precs[2]*precs[3]) ** 0.25)\n",
        "        return add_metrics(last_metrics, bleu)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "HTlGmcjBcj1O",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "n_x_vocab, n_y_vocab = len(db_trn.train_ds.x.vocab.itos), len(db_trn.train_ds.y.vocab.itos)\n",
        "\n",
        "model = Transformer(n_x_vocab, n_y_vocab, d_model=256)\n",
        "learn = Learner(db_trn, model, metrics=[accuracy, CorpusBLEU(n_y_vocab)], loss_func = CrossEntropyFlat())"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "IB92USy2wXy7",
        "colab_type": "text"
      },
      "source": [
        "Now you are going to use the awesome Learning Rate finder provided by FastAI, which is based on the paper by Leslie N. Smith, [\"Cyclical Learning Rates for Training Neural Networks\"](https://arxiv.org/abs/1506.01186). This way you don't have to do a bunch of hyperparameter searching to find a good learning rate."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "UXQDrIT_cldl",
        "colab_type": "code",
        "outputId": "41473647-e0a5-4cc9-95fb-0fc010dd1657",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 408
        }
      },
      "source": [
        "learn.lr_find()\n",
        "learn.recorder.plot(suggestion = True)"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.\n",
            "Min numerical gradient: 3.02E-03\n",
            "Min loss divided by 10: 1.74E-02\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEGCAYAAACKB4k+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAgAElEQVR4nO3deXxU1fn48c8z2XdCEkIgyA4CsggR\nBRVExbpVaqUqdQFti1ttq239+v22P9tqrXaxWqUuuNC6VNsiWrVudRcVaNg3RXYStiyQfc/z+2Nu\nYoyBhDB37kzyvF+veWXm3jv3Pocheeacc885oqoYY4wxAD6vAzDGGBM6LCkYY4xpZknBGGNMM0sK\nxhhjmllSMMYY0yzS6wCOVHp6ug4YMMDrMIwxJqwsX768UFUz2jsu7JLCgAEDyM3N9ToMY4wJKyKy\noyPHWfORMcaYZpYUjDHGNLOkYIwxppklBWOMMc0sKRhjjGlmScEYY0wzSwrGGGOahd04hc7atK+M\nf6/ZA4ACqIIIsVE+YiMjiI2KIC7aR1xUJHHREcRHRxAbGYGiNM0uHuET4qMjSIyNJDEmkrioCETE\nszIZY0ygdauk8Ke3Pw/oOUUgITqShJgI52ckSU7CSIqNIik2kpS4KFLiokiOiyI5NpLE2EiSYqJI\njI0k2dkfGWEVNmNMaOg2SeG80Vmcd1fWl77Zqyo19Y1U1zVQXddIZW09VXUNVNU2UFnbQE19I4L/\nj78I1DcoFbX1lNc0UFFT7zyc57X1lNfUU15dz86KSsqq6ymtqqOspr7d2JJiI0mNjyYjKYbeKbH0\nTo4lKyWWrJQ4+qbG0bdHHOmJ0VYrMca4rtskhbb+oIoIsVH+piO3NDQqZdV1lFTVUVb9ReIoq6mj\ntKqeA5W1HKys40BlLQVlNWzcXcrbG/dRXdf4pfPERPrITo2jX894jukZT7/UeAamJzA0M5Hs1Hgi\nfJYwjDFHr9skBa9E+IQe8dH0iI/u8HtUlZKqOnYfrCb/YBX5ByrJP1jFruIqdh2oZPmOA5RVf1ED\niY70MSg9gRFZyYzqk8zIrGRG9UkhJT7KjSIZY7owSwohSOSLRDKyT3KbxxysrGVLQQVb9pezuaCc\nTfvK+GRLES+szG8+Jjs1jtF9UzjOeYzNTjmi5GSM6X4sKYSpHvHRTOgfzYT+qV/aXlhew4bdpazf\nXcq63SWsyy/htXV7m/cPTE9gXL8eHH9MD04alMbQXonWV2GMaWZJoYtJT4xhyrAMpgz7Ytr0kqo6\n1ueXsCrvICt3HmTx5sLmGkWvpBhOGZLOyUPSOf3YXqQmWE3CmO7MkkI3kBIXxeQh6Uwekg74+yzy\nDlTx8ZZCFm8u4r1NBSxamU+ETzhpUE/OPi6Lr43KpFdSrMeRG2OCTbRpZJYbJxe5Cfgu/vFia4Gr\nVLW6xf45wO+Bpobwear62OHOmZOTo7bITmA1Nirrd5fy+vo9vLZ2L1sLKxCBcf16MH1kJtNHZDLE\nmpmMCWsislxVc9o9zq2kICJ9gcXASFWtEpF/AK+q6l9aHDMHyFHV73f0vJYU3KWqbNpXzuvr9vLW\nxn2szS8BYEBaPN84vi8Xjc+mX894j6M0xhypjiYFt5uPIoE4EakD4oHdLl/PHCURYXjvJIb3TuKH\nZw5lT0kVb2/cz6tr9/Cntz/nvrc+Z9KgNGZOyObc0VnERbs3xsMYE3xuNx/9ELgTqALeVNXLWu2f\nA9wFFACbgJtUdVcb55kLzAU45phjJuzY0aGlRk2A5R2o5IUV+SxckceOokqSYyP55vhsZk08huG9\nk7wOzxhzGKHQfJQKPA9cAhwE/gksVNWnWxyTBpSrao2IXANcoqqnH+681nzkPVVlydZinvvvTl5b\nu5fahkbGH9OD2ZMHcO7oLKJs
LidjQk4oJIVvAWer6nec11cCJ6nq9Yc4PgIoVtWUw53XkkJoKa6o\nZdGKPJ5ZupNthRVkJsdwxUn9mTXxGNISY7wOzxjj6GhScPMr3U7gJBGJF/9tK2cAG1seICJZLV5e\n0Hq/CX09E6L57qmDePvmqTwxJ4dhmUn84c1NTLr7HW58diWLPy+ksdG9JkpjTGC51tGsqktFZCGw\nAqgHVgLzReR2IFdVXwJ+ICIXOPuLgTluxWPc5fMJpx+byenHZvL5vjKeWbqTF1bm8/Lq3WSnxnFJ\nTj8umdjPxj4YE+Jc7Wh2gzUfhY/qugbeWL+Xf+Tu4qPNRURFCOeOzuLKSQMYf0wPG/dgTBB53qfg\nFksK4WlrQTlPLdnBwtw8ymrqGd03he+eOtA6po0JEksKJiSV19Tzwoo8Fny0na2FFWSlxDJn8gBm\nnXgMybE21bcxbrGkYEJaY6Py7mf7efTDrSzZWkxslI+zR/XmwvHZnDIk3RYNMibAQmVEszFt8vmE\nM0ZkcsaITNbll/Dssp28vHo3L67aTa+kGC6akM3VJw8kI8luazUmmKymYEJGTX0D72zcz/Mr8nnn\n031ER/r49sT+XDN1EJnJdteSMUfDmo9MWNtaUM6f393Ci6v8U3pfktOPa6YOIjvVJuMzpjMsKZgu\nYWdRJQ+9v5mFy/NQhQuP78t1pw1mUEai16EZE1YsKZguZffBKuZ/sJVnl+2krqGRc0dnccO0IYzI\nansNa2PMl1lSMF1SQVkNjy3eytOf7KCitoEzR/Ti+mlDGH9MavtvNqYbs6RgurSSyjr+8vF2Fny8\njYOVdUwenMYN04YweXCajZQ2pg2WFEy3UFFTz9+W7uTRD7eyv6yGsf16cMNpgzlzRCY+G+tgTLNQ\nmCXVGNclxETyvSmD+OCWadx54XEUV9Qw96nlfH3eYj7ZUuR1eMYEzMebC9lVXOn6dSwpmC4hNiqC\ny07sz7s/Po0/XjyWg5V1zHp0Cdc8lcv2wgqvwzPmqKgqsxcs4+ml7q86aUnBdCmRET6+OT6bt388\nlZ9+bTgffl7I9Hvf51cvr2d/WbXX4RnTKeU19dQ1KGkJ0a5fy5KC6ZJioyK4YdoQ3vvJaVw0Ppsn\nP9nBlN+9y53/3kBheY3X4RlzRIoragHomeD+tC+WFEyX1is5lrsvGsPbN0/l3NFZPL54G6f+9l3+\n8MZnVNTUex2eMR1S1JwU3J9J2JKC6RYGpCfwx4vH8dbNU5k+MpN5727m9Hve44WVeYTbHXim+zlg\nNQVj3DEoI5H7Zx3P89dNIjM5lpv+vpqLHvqYNXkHvQ7NmENqqilYn4IxLpnQvycvXn8yv5s5hp3F\nVcz480f8/MW1lFTWeR2aMV/RVFNItaRgjHt8PuHinH6885OpzJ40gL8t3cm0e97jH7m7aGy0JiUT\nOooraomO9JEQHeH6tVxNCiJyk4isF5F1IvKsiMS22h8jIn8Xkc0islREBrgZjzFtSY6N4pcXjOLl\nG09hQFo8tyxcw6xHl7DNxjeYEFFUUUtaQnRQpnBxLSmISF/gB0COqh4HRACXtjrsO8ABVR0C3Av8\n1q14jGnPqD4pLLx2Mnd/czQb9pRy9n0fMP+DLdQ3NHodmunmDlTUkhrvftMRuN98FAnEiUgkEA/s\nbrV/BvBX5/lC4Ayx2cyMh3w+4dKJx/DWzVM5dWgGv3n1Uy566GM+21vmdWimGyuqqCUtMcyTgqrm\nA38AdgJ7gBJVfbPVYX2BXc7x9UAJkNb6XCIyV0RyRSS3oKDArZCNaZaZHMujV07ggVnHs+tAFV9/\nYDF/fnez1RqMJ4q7Qk1BRFLx1wQGAn2ABBG5vDPnUtX5qpqjqjkZGRmBDNOYQxIRvj62D/+5aQrT\nR2by+zc+46KHPubzfVZrMMF1oKKWnkG48wjcbT46E9imqgWqWgcsAia3OiYf6AfgNDGlADa1pQ
kp\naYkx/Pmy8cz79vHsLK7kvAcW89Qn223QmwmKmvoGymrqgzJGAdxNCjuBk0Qk3uknOAPY2OqYl4DZ\nzvOZwDtqv2kmRJ0/pg//uXkqpwxJ5//9az0//udqqusavA7LdHEHKvxjZ4IxRgHc7VNYir/zeAWw\n1rnWfBG5XUQucA57HEgTkc3AzcCtbsVjTCCkJ8bw2JU53HTmMF5Ymc9FD30clDnuTfdVHMTRzOC/\nO8g1qvoL4BetNt/WYn818C03YzAm0Hw+4YdnDmV0djI/fG4VX5+3mHmzxnPK0HSvQzNd0BczpIZ5\nTcGYru70YzN5+funkJkUy5VPLOXxxdusn8EEXHGlJQVjwsaA9ASev34yZ47I5I5XNvDThWuoqbd+\nBhM4xc76H5YUjAkTiTGRPHz5BH5wxlAWLs/j0vlLbCEfEzDFFbWIQI9wH6dgTHfi8wk3Tx/GQ5eN\nZ+OeUi5/bGlzW7AxR6O4spYecVFE+IIz2YMlBWMC6JzRWTw++wS2FVbw7UeXNE95bExnFQdx4BpY\nUjAm4E4eks6jV+awtbCCyx5bysFKSwym84rKa0kLwoprTSwpGOOCKcMymH/FBDbvL+eKx5dRWm2L\n95jOOVBZS2oQ1mZuYknBGJecNrwXj1wxgY17SrnhmRU2mZ7pFH/zkdUUjOkSph3bizsvPI4PPy/k\nVy9vsHEM5og0NioHKuvoGcSagqsjmo0xcMkJx7C1oIJHPtjK4IwE5pw80OuQTJgora6joVGDWlOw\npGBMENxy9rFsLazg9lc20D8tgWnH9vI6JBMGioI87xFY85ExQRHhE+67ZBzH9k7m+39bweb9tiaD\naV/TWJdgzZAKlhSMCZqEmEgen5NDbFQE1z+zgqpamw7DHF6wZ0gFSwrGBFVWShz3XjKOz/eX84uX\n1nkdjglxwZ4hFSwpGBN0U4ZlcMNpQ/hHbh7PL8/zOhwTwiwpGNNN/OjMoZw4sCc/f3Gd9S+YQyqu\nqCU+OoLYqIigXdOSgjEeiIzwcf+s44mPtv4Fc2jBnvcILCkY45nM5FjuvWQcm/aVc/drrZcvN8aS\ngjHdzpRhGVx98kD++skO3t9U4HU4JsR0qaQgIsNFZFWLR6mI/KjVMaeJSEmLY2471PmM6apuOXs4\nQ3sl8tN/rrapts2XFFfU0jNIi+s0cS0pqOpnqjpOVccBE4BK4IU2Dv2w6ThVvd2teIwJVbFREdx3\n6TgOVNbysxfX2vxIplmXqim0cgawRVV3BOl6xoSVUX1SuHn6cF5du5dFK/K9DseEgKraBqrqGuiZ\n2DWTwqXAs4fYN0lEVovIayIyqq0DRGSuiOSKSG5BgbW7mq5p7pRBTBzQk1+8tJ49JVVeh2M8VlTh\nX+e7yzQfNRGRaOAC4J9t7F4B9FfVscADwIttnUNV56tqjqrmZGRkuBesMR6K8Al/+NZY6hoaueOV\nDV6HYzx2oMK/MFNXbD46B1ihqvta71DVUlUtd56/CkSJSHoQYjImJB2TFs+Npw/h1bV7ee+z/V6H\nYzzUVFNI64LNR7M4RNORiPQWEXGeT3TiKQpCTMaErO9NGcSg9AR+8dJ6qutsUFt31TxDaldqPhKR\nBGA6sKjFtmtF5Frn5UxgnYisBu4HLlW79cJ0czGREdw+4zh2FFXy8PtbvA7HeOSLGVKDt8AOuLzI\njqpWAGmttj3c4vk8YJ6bMRgTjk4Zms7Xx/bhwfe28I1xfRmQnuB1SCbIiitqifAJyXHBXQvNRjQb\nE6J+ft4IoiN83PbSehu70A0dqKwlNT4ap4U9aCwpGBOiMpNjuWn6MD7YVMA7n1qnc3dTVF4b1MV1\nmlhSMCaEXTmpPwPTE/jNqxupb2j0OhwTRF6MZgZLCsaEtKgIH7eecyxbCip47r+7vA7HBFFxpSUF\nY0wbzhqZycQBPbnvrU2UVdd5HY4JEqspGGPaJCL87LwRFJ
bX8sj7W70OxwRBXUMjByvrgj5wDSwp\nGBMWxvbrwYxxfXj0w602L1I30DSFelpicMcogCUFY8LGT84ajgJ/eGOT16EYlxWW+5NCujUfGWMO\npV/PeK6aPIBFK/PYtK/M63CMi4qtpmCM6Yhrpw4mLiqCee9s9joU4yKvJsMDSwrGhJXUhGiunDSA\nl9fsZvP+cq/DMS5paj6ywWvGmHZ979SBDC3dx97Lr4bkZPD5/D+vvx622AR6XUFReQ2RPiE5Niro\n17akYEyYSfvwHV55/AYmvr0IyspA1f/zscdgzBh47TWvQzRHqajcP0bB5wvuvEdgScGY8LJlC8yc\nSXRNNdGNrdZaqKuDykqYOdNqDGGuqKLGk05msKRgTHi55x7/H//DqauDe+8NTjzGFYXltaR70MkM\nlhSMCS9PP92xpPDUU8GJx7iiqKLGk05msKRgTHgp7+AdRx09zoSkovLa0G4+EpHBIhLjPD9NRH4g\nIj3cDc0Y8xWJiYE9zoScytp6KmsbPBmjAB2vKTwPNIjIEGA+0A/4m2tRGWPadvnlENXObYpRUXDF\nFcGJxwRcUfMUFyFcUwAaVbUeuBB4QFV/CmS5F5Yxpk0//nHHksJNNwUnHhNwRc1TXIR2TaFORGYB\ns4FXnG2H/Z8pIsNFZFWLR6mI/KjVMSIi94vIZhFZIyLjj7wIxnQjgwfDwoUQH/+V5FDri6AxLt6/\nf/BgjwI0R6u4eYqL0K4pXAVMAu5U1W0iMhA47O0NqvqZqo5T1XHABKASeKHVYecAQ53HXOChIwne\nmG7pnHNgzRqYO7d5RHNjUjL/OP4c7r777/79Jmx5OcUFdDApqOoGVf2Bqj4rIqlAkqr+9giucwaw\nRVV3tNo+A3hS/ZYAPUTEmqWMac/gwTBvHpSUQEMDvtIStv3ytzy+18f2wgqvozNHoalPIaSbj0Tk\nPRFJFpGewArgURH54xFc51Lg2Ta29wVaLjyb52xrff25IpIrIrkFBQVHcFljuo9rpg4iKkK4/53P\nvQ7FHIWi8hrioiKIj4705PodbT5KUdVS4Jv4v9mfCJzZkTeKSDRwAfDPzoUIqjpfVXNUNScjI6Oz\npzGmS+uVFMvlJ/bnxZX5bLPaQtgqqqj1rJYAHU8KkU6zzsV80dHcUecAK1R1Xxv78vHf3tok29lm\njOmEa6YOJjrSxwNWWwhbheXezXsEHU8KtwNv4O8X+K+IDAI6+r9uFm03HQG8BFzp3IV0ElCiqns6\neF5jTCsZSTFccZK/trC1wEY1h6Oi8lpPluFs0tGO5n+q6hhVvc55vVVVL2rvfSKSAEwHFrXYdq2I\nXOu8fBXYCmwGHgWuP8L4jTGtzJ3iry3Y6mzhyT9DaognBRHJFpEXRGS/83heRLLbe5+qVqhqmqqW\ntNj2sKo+7DxXVb1BVQer6mhVze18UYwx4K8tXDlpAC+ustpCuFFVT+c9go43Hy3A39TTx3m87Gwz\nxoSguVMGOX0LVlsIJ6VV9dQ3qmdjFKDjSSFDVReoar3z+AtgtwEZE6LSE/21hX+tymeL1RbCRqEz\nmjk9DGoKRSJyuYhEOI/LgSI3AzPGHJ25UwYRExnBA2/bnUjhwuuBa9DxpHA1/ttR9wJ7gJnAHJdi\nMsYEQHpiDLMnD+Bfq3ezLr+k/TcYzzXNe9Qz1JuPVHWHql6gqhmq2ktVvwG0e/eRMcZb108bTGp8\nNLe/sgFV9Toc046meY/CofmoLTcHLApjjCuSY6O4afowlm0r5o31e70Ox7SjqfkoNT7EawqHIAGL\nwhjjmlkn9GNYZiK/efVTauobvA7HHEZRRQ0pcVFER3q3UvLRXNnqosaEgcgIHz8/byQ7iyv5y0fb\nvQ7HHIZ/jIJ3tQRoJymISJmzOE7rRxn+8QrGmDAwZVgG04ZnMO+dzRSW13gdjjmEwvIaz5bhbHLY\npKCqSaqa3MYjSVW9md
fVGNMpPztvJJV1Ddzz5iavQzGH4PUMqXB0zUfGmDAypFciV5zUn+f+u9Nu\nUQ1RReXeznsElhSM6VZuOnMYqfHR/PKl9XaLaoipb2jkQGUdaaHcfGSM6VpS4qO45WvDyd1xgBdX\n2dIloaS4smmMgtUUjDFBdHFOP8Zkp3DXq59SXlPvdTjG8cUUF1ZTMMYEkc8n/OqCUewvq7F5kUJI\nU1LwcooLsKRgTLd0/DGpzJyQzRMfbbNZVENEUfMMqZYUjDEe+J+zjyU2MsI6nUNEc/ORdTQbY7yQ\nkRTDj6YP48PPC3lj/T6vw+n2iipqiPAJKXFRnsZhScGYbmz2pP4Mz0zijlc2UFVr8yJ5qai8lp4J\n0fh83k4r52pSEJEeIrJQRD4VkY0iMqnV/tNEpEREVjmP29yMxxjzZZERPm6fMYr8g1U8+J4t3eml\nwvJaT5fhbOL2VBV/Al5X1ZkiEg3Et3HMh6p6vstxGGMO4cRBacwY14dH3t/KN8dnMzA9weuQuqWi\nihpP11Fo4lpNQURSgCnA4wCqWquqB926njGm8/7v3BFER/r41cvW6eyFovIathVWkJHUhZMCMBAo\nABaIyEoReUxE2voKMklEVovIayIyqq0TichcEckVkdyCggIXQzame8pMjuVHZw7lvc8K+M8G63QO\nppr6Bq55ajlVtQ3MmTzA63BcTQqRwHjgIVU9HqgAbm11zAqgv6qOBR4AXmzrRKo6X1VzVDUnIyPD\nxZCN6b5mTx7AsMxEfvHSekqr67wOp1tQVf530VpydxzgnovHMrZfD69DcjUp5AF5qrrUeb0Qf5Jo\npqqlqlruPH8ViBKRdBdjMsYcQlSEj7svGsO+0mrufGWj1+F0Cw++t4VFK/K56cxhnD8mNJaocS0p\nqOpeYJeIDHc2nQFsaHmMiPQWEXGeT3TiKXIrJmPM4Y0/JpW5Uwbz99xdvPvZfq/D6dJeX7eH37/x\nGReM7cMPzhjidTjN3B6ncCPwjIisAcYBvxGRa0XkWmf/TGCdiKwG7gcuVevlMsZTN00fyrDMRG59\nfg0lldaM5Ja7XvuUUX2S+d3MMTjfjUOCq0lBVVc5fQFjVPUbqnpAVR9W1Yed/fNUdZSqjlXVk1T1\nYzfjMca0LyYygj98ayyF5bXc/sqG9t9gjtiBilp2FFXy9bF9iI2K8DqcL7ERzcaYrxiT3YPrTxvM\n8yvyeMvuRgq4tc7Kd2P6pngcyVdZUjDGtOnG04dybO8kbnl+DfkHq7wOp0tpSgqjLCkYY8JFdKSP\nBy8bT119I9c9vZzqOpsbKVDW5B1kYHqC55PftcWSgjHmkAZlJHLPxWNZk1fCL19a73U4XcbavBKO\nC8FaAlhSMMa046xRvblh2mCe++8unlu20+twwl5heQ27S6pDsj8BLCkYYzrg5unDOXVoOrf9az2r\nd9kUZkejqT9hdLYlBWNMmIrwCfdfejwZSTHc/I9V1NY3eh1S2FqbV4IIjOqT7HUobbKkYIzpkNSE\naG6fMYotBRX89ePtXocTttbklTAoPYGk2NDrZAZLCsaYI3DGiEymDc/gT29/zv7Saq/DCUtr8w8y\nJtv7ie8OxZKCMeaI3Pb1UdTWN3L3a596HUrY2V9azb7SGkaHaCczWFIwxhyhgekJfPfUgSxamU/u\n9mKvwwkrzSOZQ7STGSwpGGM64funDyErJZbb/rWehkabw7Kj1uSV4BMYGaKdzGBJwRjTCfHRkfzs\nvBFs2FPKk59s9zqcsLE2v4ShvZKIj470OpRDsqRgjOmU80ZnMW14Bnf+eyPvb7JlctujqqwJ4ZHM\nTSwpGGM6RUS4f9bxDMtM4rqnl7POaS83bdtbWk1heU1I9yeAJQVjzFFIio1iwVUnkBofzZwF/2VX\ncaXXIYWsNXmhPZK5iSUFY8xRyUyO5a9Xn0BdQyOzFyzjQEWt1yGFpLV5JUT4hJFZodvJ
DJYUjDEB\nMKRXEo/NziHvQBXfezLXptluw9r8EoZlJoXcSmutWVIwxgTECQN68seLx5K74wC3LFyDLbf+ZTuK\nKhickeB1GO2ypGCMCZjzx/Thp18bzkurd3PvfzZ5HU7IUFX2lFTTp0ec16G0y9WkICI9RGShiHwq\nIhtFZFKr/SIi94vIZhFZIyLj3YzHGOO+608bzMU52dz/zmaeX57ndTghobiilpr6RrJSYr0OpV1u\nj6D4E/C6qs4UkWggvtX+c4ChzuNE4CHnpzEmTIkIv/7GaHYVV3HrojX06RHHpMFpXoflqT0l/skD\ns1K6cU1BRFKAKcDjAKpaq6qtV+eYATypfkuAHiKS5VZMxpjgiI708fDlE+iflsDcp3LZtK/M65A8\ntftgFQB9eoR+TcHN5qOBQAGwQERWishjItK6l6UvsKvF6zxn25eIyFwRyRWR3IICGzlpTDhIiY9i\nwZwTiI2KYM4Ty9hb0n2n2raagl8kMB54SFWPByqAWztzIlWdr6o5qpqTkZERyBiNMS7q1zOeBXNO\noKSqjjkLllFWXed1SJ7YXVJFdISPtIRor0Npl5tJIQ/IU9WlzuuF+JNES/lAvxavs51txpgu4ri+\nKTx0+QQ27y/nuqdXdMulPPccrKZ3Siw+n3gdSrtcSwqquhfYJSLDnU1nABtaHfYScKVzF9JJQImq\n7nErJmOMN6YMy+Cub45m8eZCbv7Hqm433fbekuqwuPMI3L/76EbgGefOo63AVSJyLYCqPgy8CpwL\nbAYqgatcjscY45Fv5fSjuKKWu177lNioCH530Ziw+OYcCLtLqjhhQE+vw+gQV5OCqq4CclptfrjF\nfgVucDMGY0zouGbqYKrqGrjvrc+JjfJxx4zjEOnaiaGxUdlXajUFY4xp0w/PGEp1XSMPv7+FmMgI\nfn7eiC6dGArLa6hrUEsKxhjTFhHhf84eTnVdA48v3kZCdAQ3nzW8/TeGqd1hdDsqWFIwxnhARPjF\n10dSWVvP/e9sJiU+mu+cMtDrsFyxxxm4lhUGA9fAkoIxxiMiwl3fHENZdT13vLKBHnFRXDQh2+uw\nAq6pptAnTGoKNkuqMcYzET7hvkvHccqQdG55fg1vrt/rdUgBt+dgFbFRPnrER3kdSodYUjDGeCom\nMoJHrpjA6L4pfP/Zlby9cZ/XIQXUnpJq+qTEhU1nuiUFY4znEmIi+ctVJzA8M4nvPZnLYx9u7TKL\n9OwuqQqb/gSwpGCMCRE94qP5+zUncdbI3vz63xv5vxfWUdcQ/lNi7DlYHTZ3HoElBWNMCImPjuTB\ny8Zz/WmDeXbZTuYsWEZJVfhOolff0Mj+smr6hMkYBbCkYIwJMT6fcMvZx3LPt8aybFsxlz22hAMV\ntV6H1Sn7ympoVMgKg2U4m14zjf0AAA3eSURBVFhSMMaEpIsmZDP/ihw27Svn0vlLKCir8TqkI9Y8\nRsFqCsYYc/SmHduLBXNOYGdxJZfM/yTsFuppHqNgNQVjjAmMk4ek8+R3JrK/tIaLH/mEHUUVXofU\nYVZTMMYYF5wwoCdPf/dESqvruPDBj1m+44DXIXXInpJqEmMiSYoNj4FrYEnBGBMmxvXrwaLrJpMU\nG8msR5fw7zWhvx7XnpKqsKolgCUFY0wYGZSRyKLrJjO6bwo3/G0FD7+/JaQHue0pqQ6rO4/AkoIx\nJsykJcbwzHdP5LwxWdz92qf86uUNNIbo8p67D4bXGAWwWVKNMWEoNiqCBy49nl5JMSz4aDulVXX8\nduYYoiJC53tuTX0DheU1YTWaGSwpGGPClM8n3Hb+SFLjo/njfzZRWl3HvG+PJzYqwuvQANhX4h9X\nEU7zHoHLzUcisl1E1orIKhHJbWP/aSJS4uxfJSK3uRmPMaZrERF+cMZQ7pgxirc/3c/sJ0JnWozd\nJf7bUcNlHYUmwagpTFPVwsPs/1BVzw9CHMaYLuqK
SQNIjoviJ/9czYUPfsQTs09gQHqCpzHtKQmv\nFdeahE4DnDHGHIUZ4/ry9HdO5EBFLTP+/BGfbCnyNJ7dB8NrxbUmbicFBd4UkeUiMvcQx0wSkdUi\n8pqIjGrrABGZKyK5IpJbUFDgXrTGmLB24qA0XrzhZDKSYrji8aU8t2ynZ7HsKamiR3wUcdGh0cfR\nUW4nhVNUdTxwDnCDiExptX8F0F9VxwIPAC+2dRJVna+qOaqak5GR4W7Expiw1j8tgUXXT2bykHRu\nXbSWn7+4ltr64K/LsKu4KuzuPAKXk4Kq5js/9wMvABNb7S9V1XLn+atAlIikuxmTMabrS46N4onZ\nOVwzdRBPL9nJpUGeTC//YBUfbS5k8uC0oF0zUFxLCiKSICJJTc+Bs4B1rY7pLc7CpSIy0YnH24ZA\nY0yXEBnh43/PGcGfvz2eT/eWcf4Di1m6NTh/Xp5YvA0Frjp5QFCuF0hu1hQygcUishpYBvxbVV8X\nkWtF5FrnmJnAOueY+4FLNZTHrBtjws55Y7J48YaTSY6N5LLHlvLy6t2uXq+kqo7nlu3k/DFZZKfG\nu3otN7h2S6qqbgXGtrH94RbP5wHz3IrBGGMAhmUm8eL3T+Y7f/kvP3xuJVV1DVyc08+Vaz2zdAcV\ntQ3MnTLIlfO7zW5JNcZ0C8mxUfz16omcPCSdWxau4S8fbQv4NWrqG1jw0XZOHZrOqD4pAT9/MFhS\nMMZ0G/HRkTw2O4fpIzP55csb+PO7mwN6/n+t3E1BWU3Y1hLAkoIxppuJiYzgwcvGM2NcH37/xmfc\n99amgJy3sVGZ/+FWRmYlc8qQ8L2J0ibEM8Z0O1ERPv548TiiInzc99bnNDYqN00fhnMzZKe8+9l+\nNu8v575Lxh3VebxmScEY0y1F+ITfXTQGn8D972ymQZWfnDW8U3/QK2rqufu1T+nbI47zxmS5EG3w\nWFIwxnRbPp9w9zfHEOET/vzuFuoblVvPPvaIEoOqcuuitWwpKOfJq08MqTUdOsOSgjGmW/P5hDu/\nMZoIn/DI+1spq67njhnHEeHrWGL4y8fbeXn1bn76teGcMjR8+xKaWFIwxnR7Pp9wx4zjSIqN4qH3\ntnCwspZ7LxlHTOThJ7PL3V7Mnf/eyJkjMrlu6uAgResuSwrGGIN/wZ7/OftYesZHc+erGymtyuWR\nKyZQW9/IZ/vK2LSvjIOVdaQlRpORGENSbBQ/fG4lfVPjuOfisfg6WLMIdZYUjDGmhe9NGURqQjT/\n8/wacn79FlV1DYc8NjbKx1+vnkhKXFQQI3SXJQVjjGll5oRsMpJieH3dHgamJzC8dzLDM5PomRBN\ncUUtBWU1FJRX0z8tgcEZiV6HG1CWFIwxpg1Th2UwddhX12/pnRJL75RYIDynsWhPeN87ZYwxJqAs\nKRhjjGlmScEYY0wzSwrGGGOaWVIwxhjTzJKCMcaYZpYUjDHGNLOkYIwxppmoqtcxHBERKQB2tLEr\nBSjp5Oum500/04HCTobY+jpHckxb2zsSd8vnLbe5WQ43y9DyeXf/LLwuQ8vnofJZ2O9258rRX1W/\nOhqvNVXtEg9gfmdfNz1v8TM3UHEcyTFtbe9I3G2Vwe1yuFkG+yxCpwyh+FnY7/bRlaO9R1dqPnr5\nKF6/fIhjAhHHkRzT1vaOxN3yeSDK0JHzuFmGjly/I7rCZ+F1GToaQ3sCWQ773XZR2DUfBYOI5Kpq\njtdxHK2uUI6uUAboGuWwMoQON8vRlWoKgTTf6wACpCuUoyuUAbpGOawMocO1clhNwRhjTDOrKRhj\njGlmScEYY0yzLp8UROQJEdkvIus68d4JIrJWRDaLyP0iIi323Sgin4rIehH5XWCj/kocAS+DiPxS\nRPJFZJXzODfwkX8lFlc+C2f/j0VERSQ9cBG3GYcbn8UdIrLG+RzeFJE+gY/8K7G4UY7fO78Ta0Tk\nBRHpEfjIvxSH
G2X4lvM73SgirnVIH03shzjfbBH53HnMbrH9sL83bXLrXtdQeQBTgPHAuk68dxlw\nEiDAa8A5zvZpwFtAjPO6VxiW4ZfAT8L9s3D29QPewD+oMT3cygAktzjmB8DD4fhZAGcBkc7z3wK/\nDcMyjACGA+8BOaEWuxPXgFbbegJbnZ+pzvPUw5XzcI8uX1NQ1Q+A4pbbRGSwiLwuIstF5EMRObb1\n+0QkC/8v6xL1/+s+CXzD2X0dcLeq1jjX2B+GZQg6F8txL3AL4PpdE26UQVVLWxyaQPiW401VrXcO\nXQJkh2EZNqrqZ27GfTSxH8LXgP+oarGqHgD+A5zd2d//Lp8UDmE+cKOqTgB+AjzYxjF9gbwWr/Oc\nbQDDgFNFZKmIvC8iJ7gabduOtgwA33eq+k+ISKp7oR7WUZVDRGYA+aq62u1AD+OoPwsRuVNEdgGX\nAbe5GOvhBOL/VJOr8X8zDbZAliHYOhJ7W/oCu1q8bipPp8oZ2cGLdhkikghMBv7Zonkt5ghPE4m/\nqnYScALwDxEZ5GRj1wWoDA8Bd+D/VnoHcA/+X+SgOdpyiEg88H/4my08EaDPAlX9GfAzEflf4PvA\nLwIWZAcEqhzOuX4G1APPBCa6Dl83YGUItsPFLiJXAT90tg0BXhWRWmCbql4Y6Fi6XVLAXzs6qKrj\nWm4UkQhgufPyJfx/NFtWf7OBfOd5HrDISQLLRKQR/wRVBW4G3sJRl0FV97V436PAK24GfAhHW47B\nwEBgtfOLlA2sEJGJqrrX5dibBOL/U0vPAK8S5KRAgMohInOA84EzgvUlqYVAfxbB1GbsAKq6AFgA\nICLvAXNUdXuLQ/KB01q8zsbf95BPZ8rpVkdKKD2AAbTo0AE+Br7lPBdg7CHe17qT5lxn+7XA7c7z\nYfirbhJmZchqccxNwHPh+Fm0OmY7Lnc0u/RZDG1xzI3AwnD8LICzgQ1ARjDid/P/Ey53NHc2dg7d\n0bwNfydzqvO8Z0fK2WZcwfrwvHoAzwJ7gDr83/C/g//b5evAauc/8W2HeG8OsA7YAszjixHg0cDT\nzr4VwOlhWIangLXAGvzfnrLcLINb5Wh1zHbcv/vIjc/ieWf7GvyTnvUNx88C2Iz/C9Iq5+HqXVQu\nleFC51w1wD7gjVCKnTaSgrP9aufffzNw1ZH83rR+2DQXxhhjmnXXu4+MMca0wZKCMcaYZpYUjDHG\nNLOkYIwxppklBWOMMc0sKZguQUTKg3y9x0RkZIDO1SD+GVLXicjL7c0uKiI9ROT6QFzbmNbsllTT\nJYhIuaomBvB8kfrF5G6uahm7iPwV2KSqdx7m+AHAK6p6XDDiM92L1RRMlyUiGSLyvIj813mc7Gyf\nKCKfiMhKEflYRIY72+eIyEsi8g7wtoicJiLvichC8a8T8EzTfPTO9hznebkzod1qEVkiIpnO9sHO\n67Ui8usO1mY+4YvJ/hJF5G0RWeGcY4ZzzN3AYKd28Xvn2J86ZVwjIr8K4D+j6WYsKZiu7E/Avap6\nAnAR8Jiz/VPgVFU9Hv+MpL9p8Z7xwExVneq8Ph74ETASGASc3MZ1EoAlqjoW+AD4Xovr/0lVR/Pl\n2Srb5MzRcwb+EeYA1cCFqjoe/xoe9zhJ6VZgi6qOU9WfishZwFBgIjAOmCAiU9q7njFt6Y4T4pnu\n40xgZItZJ5Od2ShTgL+KyFD8s8RGtXjPf1S15Tz3y1Q1D0BEVuGfr2Zxq+vU8sWEgsuB6c7zSXwx\nf/3fgD8cIs4459x9gY3458MH/3w1v3H+wDc6+zPbeP9ZzmOl8zoRf5L44BDXM+aQLCmYrswHnKSq\n1S03isg84F1VvdBpn3+vxe6KVueoafG8gbZ/Z+r0i865Qx1zOFWqOs6ZCvwN4AbgfvxrK2QAE1S1\nTkS2A7FtvF+Au1T1kSO8rjFfYc1Hpit7E/+sowCISNO0xCl8MYXwHBevvwR/sx
XApe0drKqV+Jfj\n/LGIROKPc7+TEKYB/Z1Dy4CkFm99A7jaqQUhIn1FpFeAymC6GUsKpquIF5G8Fo+b8f+BzXE6Xzfg\nn/Ic4HfAXSKyEndryz8CbhaRNfgXRylp7w2quhL/bKmz8K+tkCMia4Er8feFoKpFwEfOLay/V9U3\n8TdPfeIcu5AvJw1jOsxuSTXGJU5zUJWqqohcCsxS1Rntvc8YL1mfgjHumQDMc+4YOkiQlzs1pjOs\npmCMMaaZ9SkYY4xpZknBGGNMM0sKxhhjmllSMMYY08ySgjHGmGb/HyRou9sexXjkAAAAAElFTkSu\nQmCC\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oU_FVP8v144R",
        "colab_type": "text"
      },
      "source": [
        "It is common to pick a learning rate slightly lower than the suggested point."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Klji7mTCx5gq",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "max_lr = 5e-4"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hbR6ZLKQw0f7",
        "colab_type": "text"
      },
      "source": [
        "**DRUM ROLL PLEASE!!!!!** You are finally going to start training your model! You will train for 8 epochs, because that is what the original code in the NLP course used and it also happened to work best in my own runs. You are also adding a few callbacks: automatically saving the best-performing model, early stopping, and plotting the training and validation loss. Since you are using early stopping, feel free to try a higher number of epochs; training will stop once the validation loss stops improving."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "bn6B-VEUxW68",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "def train_model(learn, epochs, model_name, max_lr = 5e-4):\n",
        "    \"\"\"Trains a model with the save-model, early-stopping, and show-graph callbacks.\"\"\"\n",
        "    callback_fns = [\n",
        "        callbacks.SaveModelCallback(\n",
        "            learn, every='improvement',\n",
        "            monitor='valid_loss', name=f'{model_name}_save_model'\n",
        "        ),\n",
        "        callbacks.EarlyStoppingCallback(\n",
        "            learn, monitor='valid_loss', min_delta = 0.01,\n",
        "            patience = 3\n",
        "        ),\n",
        "        ShowGraph(learn)\n",
        "    ]\n",
        "    \n",
        "    learn.fit_one_cycle(epochs, max_lr, div_factor=5, callbacks = callback_fns)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "6xWkTgfMxz45",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "epochs = 8\n",
        "model_name = 'comment_gen'"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vug9103IWzrk",
        "colab_type": "text"
      },
      "source": [
        "Training on Google Colab can take anywhere from ~20 to 60 minutes depending on the type of GPU they give you. So, relax, get an IV caffeine drip going, and let your model cook in peace :)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "XjkA5AckgvaG",
        "colab_type": "code",
        "outputId": "ac276d38-5fc2-422c-d7c8-bd0dab3a808a",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 681
        }
      },
      "source": [
        "train_model(learn, epochs, model_name, max_lr = max_lr)"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "text/html": [
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: left;\">\n",
              "      <th>epoch</th>\n",
              "      <th>train_loss</th>\n",
              "      <th>valid_loss</th>\n",
              "      <th>accuracy</th>\n",
              "      <th>bleu</th>\n",
              "      <th>time</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <td>0</td>\n",
              "      <td>1.182219</td>\n",
              "      <td>1.133453</td>\n",
              "      <td>0.828182</td>\n",
              "      <td>0.791774</td>\n",
              "      <td>06:46</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>1</td>\n",
              "      <td>0.920205</td>\n",
              "      <td>0.954264</td>\n",
              "      <td>0.841556</td>\n",
              "      <td>0.799681</td>\n",
              "      <td>06:47</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>2</td>\n",
              "      <td>0.812330</td>\n",
              "      <td>0.875513</td>\n",
              "      <td>0.849487</td>\n",
              "      <td>0.804000</td>\n",
              "      <td>06:44</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>3</td>\n",
              "      <td>0.752023</td>\n",
              "      <td>0.828835</td>\n",
              "      <td>0.853668</td>\n",
              "      <td>0.807183</td>\n",
              "      <td>06:45</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>4</td>\n",
              "      <td>0.679716</td>\n",
              "      <td>0.794862</td>\n",
              "      <td>0.856593</td>\n",
              "      <td>0.809325</td>\n",
              "      <td>06:43</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>5</td>\n",
              "      <td>0.653454</td>\n",
              "      <td>0.777795</td>\n",
              "      <td>0.859418</td>\n",
              "      <td>0.811010</td>\n",
              "      <td>06:42</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>6</td>\n",
              "      <td>0.611860</td>\n",
              "      <td>0.770059</td>\n",
              "      <td>0.860419</td>\n",
              "      <td>0.812164</td>\n",
              "      <td>06:49</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <td>7</td>\n",
              "      <td>0.605370</td>\n",
              "      <td>0.769881</td>\n",
              "      <td>0.860601</td>\n",
              "      <td>0.812119</td>\n",
              "      <td>06:45</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>"
            ],
            "text/plain": [
              "<IPython.core.display.HTML object>"
            ]
          },
          "metadata": {
            "tags": []
          }
        },
        {
          "output_type": "stream",
          "text": [
            "Better model found at epoch 0 with valid_loss value: 1.133453130722046.\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWoAAAD4CAYAAADFAawfAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAgAElEQVR4nO3de3Cd9X3n8ff3nOdcdb/ZkiXjGwkY\njHGwwpLAUBqaFEjIZQKBTtJmaXfcTdMFMruzQ6ezTTLTnUk723Y3TZqUNPSym0ATJwzZNGkuLZTJ\nhkBkMMZgAtjYkWRblmTdde7nt3+cR7Ysy9KRfI7Okfx5zWjOo+f6fXzkz3nO7/k9z2POOUREpHoF\nKl2AiIgsTEEtIlLlFNQiIlVOQS0iUuUU1CIiVc4rx0obmprdW7ZtLceqRUTWpH379g0559rmm1aW\noG7bsJGenp5yrFpEZE0ys2MXmlaepg/1zRYRKZmyBLVyWkSkdMoT1OVYqYjIJaosbdQ6ohaRpchk\nMvT19ZFMJitdStlFo1G6uroIhUJFL1OeoNYxtYgsQV9fH3V1dWzevBkzq3Q5ZeOcY3h4mL6+PrZs\n2VL0cmqjFpGKSyaTtLS0rOmQBjAzWlpalvzNQW3UIlIV1npIz1jOfpbpiFpRLSJSKrqEXEQueaOj\no/zVX/3Vkpe74447GB0dLUNF51IbtYhc8i4U1NlsdsHlvve979HY2Fiuss4oU68PEZHV46GHHuLw\n4cPs2rWLUChENBqlqamJV199lddee40PfvCD9Pb2kkwmeeCBB9izZw8Amzdvpqenh8nJSW6//XZu\nuukmfvrTn9LZ2ckTTzxBLBYrSX1l6ketqBaR5fns/32ZV46Pl3SdV22o59N3Xn3B6Z/73Oc4ePAg\n+/fv56mnnuK9730vBw8ePNOF7pFHHqG5uZlEIsHb3/52PvzhD9PS0nLOOl5//XUeffRRvvKVr/CR\nj3yEb33rW3zsYx8rSf06ohYRmeP6668/p5/z5z//eR5//HEAent7ef31188L6i1btrBr1y4Adu/e\nzdGjR0tWT1FBbWafAv4DhQx+CbjPOXfBjoA6oBaR5VroyHel1NTUnBl+6qmn+PGPf8wzzzxDPB7n\nlltumbcfdCQSOTMcDAZJJBIlq2fRk4lm1gncD3Q753YAQeDehZZR04eIrCZ1dXVMTEzMO21sbIym\npibi8TivvvoqP/vZz1a4uuKbPjwgZmYZIA4cX2jmvHJaRFaRlpYWbrzxRnbs2EEsFmP9+vVnpt12\n2218+ctfZvv27VxxxRXccMMNK16fFXP0a2YPAP8dSAA/dM59dJ559gB7AGo7tu6eOH64xKWKyFp1\n6NAhtm/fXukyVsx8+2tm+5xz3fPNX0zTRxPwAWALsAGoMbPzTmU65x52znU757qDXvF3hRIRkYUV\nc8HLrwFvOucGnXMZ4NvAOxdaIK82ahGRkikmqH8J3GBmcSvcTeRW4NBCC6iNWkSkdBYNaufcs8Be\n4HkKXfMCwMMLLZNTUouIlExRvT6cc58GPl3sSvPOkcnlCQV1zycRkYtVtiQdmUqXa9UiIpeUsgX1\nsIJaRNaw2tpaAI4fP85dd9017zy33HILPT09F72tsgX1yfG1/5BKEZENGzawd+/esm6jbEHdP1K6\n69xFRMrtoYce4otf/OKZ3z/zmc/wx3/8x9x6661cd911XHPNNTzxxBPnLXf06FF27NgBQCKR4N57\n72X79u186EMfKtn9Pspy9zwD+kcV1CKyDN9/CE6+VNp1tl8Dt39uwVnuueceHnzwQT75yU8C8I1v\nfIMf/OAH3H///dTX1zM0NMQNN9zA+9///gs+9/BLX/oS8XicQ4cOceDAAa677rqSlF+WoA4FAxxX\nUIvIKvK2t72NU6dOcfz4cQYH
B2lqaqK9vZ1PfepTPP300wQCAfr7+xkYGKC9vX3edTz99NPcf//9\nAOzcuZOdO3eWpLayBbWaPkRkWRY58i2nu+++m71793Ly5Enuuecevva1rzE4OMi+ffsIhUJs3rx5\n3luclltZ2qjDXoDekelyrFpEpGzuueceHnvsMfbu3cvdd9/N2NgY69atIxQK8eSTT3Ls2LEFl7/5\n5pv5+te/DsDBgwc5cOBASeoqyxF12AswMJ4imckRDQXLsQkRkZK7+uqrmZiYoLOzk46ODj760Y9y\n5513cs0119Dd3c2VV1654PKf+MQnuO+++9i+fTvbt29n9+7dJamrPEEdNLJA30iCy9fVlmMTIiJl\n8dJLZ09ktra28swzz8w73+TkJFB4wO3BgwcBiMViPPbYYyWvqWxNHwC9p9X8ISJyscoU1IXmjmPD\nU+VYvYjIJaUsQe0FjLqIx5EhBbWIFOdSedbqcvazbFcmbl1Xy+HByXKtXkTWkGg0yvDw8JoPa+cc\nw8PDRKPRJS1XlpOJANtaa3jmyHC5Vi8ia0hXVxd9fX0MDg5WupSyi0ajdHV1LWmZ8gX1ulq+/UI/\nU6ksNZGybUZE1oBQKMSWLVsqXUbVKubhtleY2f5ZP+Nm9uBiy21trQHgTbVTi4hclGIexfUL59wu\n59wuYDcwDTy+2HLb/P7TaqcWEbk4Sz2ZeCtw2Dm38HWUwKaWOAGDw6cU1CIiF2OpQX0v8Oh8E8xs\nj5n1mFnP4OAgES9IV1NcXfRERC5S0UFtZmHg/cA355vunHvYOdftnOtua2sDYHNrDUd10YuIyEVZ\nyhH17cDzzrmBYhfY0hLn6ND0mu8bKSJSTksJ6t/gAs0eF7K5tYbJVJahST3oVkRkuYoKajOrAd4N\nfHspK9/sd9FT84eIyPIVFdTOuSnnXItzbmwpK9/Sor7UIiIXq2z3+gDoaorhBYyjCmoRkWUra1B7\nwQAbm+Nq+hARuQhlDWqAy5rj/FIPEBARWbYVCepjw+qiJyKyXGUP6k0tcSaSWcYSmXJvSkRkTSp7\nUG9sjgOo+UNEZJlWpOkDFNQiIsu1YkF9bFhBLSKyHGUP6pqIR2ttmF4dUYuILEvZgxoK7dRq+hAR\nWZ4VCepNfhc9ERFZuhUJ6sua45wYS5DO5ldicyIia8qKNX3kHRwfTazE5kRE1pSVafrw76J3TO3U\nIiJLtmJNH6C+1CIiy7EiQb2uLkLYC6iLnojIMhT7hJdGM9trZq+a2SEze8eSNhKwwl301PNDRGTJ\nvCLn+1/APzvn7vKfRh5f6oYua46rjVpEZBkWPaI2swbgZuCrAM65tHNudKkbuqw5Tu9p3e5URGSp\nimn62AIMAn9rZi+Y2d/4D7s9h5ntMbMeM+sZHBw8byWXNceZTGUZmdbtTkVElqKYoPaA64AvOefe\nBkwBD82dyTn3sHOu2znX3dbWdt5Kzt6cSY/lEhFZimKCug/oc8496/++l0JwL8llLeqiJyKyHIsG\ntXPuJNBrZlf4o24FXlnqhjY2FYJaXfRERJam2F4f/wn4mt/j4whw31I3FAsHWVcX0c2ZRESWqKig\nds7tB7ovdmN6IrmIyNKtyJWJMy5riavpQ0RkiVY2qJvjnBhPksrmVnKzIiKr2ooHtXPQN6LbnYqI\nFGtFg3qTuuiJiCzZigb1xmZ10RMRWaoVDeq22gixUFBd9ERElmBFg9rM1EVPRGSJVjSoodD8oaYP\nEZHirXhQzxxR63anIiLFqUBQx5hO5xiaTK/0pkVEVqWVD2q/i17viJo/RESKUZGmD1AXPRGRYq14\nUHf5tzvVg25FRIqz4kEdDRVud6oueiIixVnxoAbd7lREZCkqEtSdTTH6R3VjJhGRYhQV1GZ21Mxe\nMrP9ZtZzsRvtaopxcixJNpe/2FWJiKx5xT6KC+BXnXNDpdhoZ2OcbN4xMJGiszFWilWKiKxZFW
n6\n6GoqhHO/7kstIrKoYoPaAT80s31mtme+Gcxsj5n1mFnP4ODggivr9IO6Txe9iIgsqtigvsk5dx1w\nO/BJM7t57gzOuYedc93Oue62trYFVzbT3KEjahGRxRUV1M65fv/1FPA4cP3FbDQaCtJaG9EjuURE\nirBoUJtZjZnVzQwD7wEOXuyGu9RFT0SkKMX0+lgPPG5mM/N/3Tn3zxe74c6mGC/3j13sakRE1rxF\ng9o5dwS4ttQb7mqK8aOXB8jnHYGAlXr1IiJrRkW65wF0NcZI5/IMTaYqVYKIyKpQsaCe6aLXqxOK\nIiILqtwRtX+7U51QFBFZWOWOqBt10YuISDEqFtQ1EY+meEh9qUVEFlGxoIZC84eCWkRkYRUN6o3N\nMTV9iIgsouJH1P0jCZxzlSxDRKSqVTioY6SyeQYn1JdaRORCKtv04XfRU19qEZELq/gRNaiLnojI\nQireRg2o54eIyAIqGtSxcJDW2rCOqEVEFlDRoAbobIrTe1pH1CIiF1LxoN7YpL7UIiILqXhQdzXF\n6R9NkM+rL7WIyHyKDmozC5rZC2b23VIWsLE5RibnGJhIlnK1IiJrxlKOqB8ADpW6APX8EBFZWFFB\nbWZdwHuBvyl1ATN9qXtPq51aRGQ+xR5R/0/gvwL5C81gZnvMrMfMegYHB4su4Ox9qXVELSIyn0WD\n2szeB5xyzu1baD7n3MPOuW7nXHdbW1vRBURDQdbVRdTzQ0TkAoo5or4ReL+ZHQUeA95lZv+nlEVs\nbFZfahGRC1k0qJ1zf+Cc63LObQbuBf7VOfexUhaxqTnOseGpUq5SRGTNqHg/aoAtrTUcH0uSSOcq\nXYqISNVZUlA7555yzr2v1EVsaasB4KiOqkVEzlM1R9QARwYV1CIic1VZUE9WuBIRkepTFUEdD3ts\naIhyZEhH1CIic1VFUANsW1fLYR1Ri4icp3qCuq2Ww6cm9URyEZE5qiioa5hK5zilJ5KLiJyjaoJ6\na1stAIdPqflDRGS2qgnqbTNBrXZqEZFzVE1Qr6+PUBMOclh9qUVEzlE1QW1m6vkhIjKPqglqgK2t\nNbo6UURkjqoK6m1ttfSPJphOZytdiohI1aiuoF5XOKGoo2oRkbOqKqjfur4Q1K8NTFS4EhGR6lFV\nQb25pYawF+DVkwpqEZEZVRXUXjDAFevrOHRivNKliIhUjWIebhs1s+fM7EUze9nMPlvOgq5sr+PQ\nCR1Ri4jMKOaIOgW8yzl3LbALuM3MbihXQds76hmaTDGoe36IiADFPdzWOedmrkIJ+T9lu8Xd9o56\nAA72j5VrEyIiq0pRbdRmFjSz/cAp4EfOuWfnmWePmfWYWc/g4OCyC7p2YwPBgLHv2Miy1yEispYU\nFdTOuZxzbhfQBVxvZjvmmedh51y3c667ra1t2QXFwx5XddTTc+z0stchIrKWLPUp5KPAk8Bt5Smn\nYPemJl7sHSOby5dzMyIiq0IxvT7azKzRH44B7wZeLWdR125sIJHJ6U56IiIUd0TdATxpZgeAn1No\no/5uOYu6prMRgAN9o+XcjIjIquAtNoNz7gDwthWo5YytrTXURjwO9I1xd/fGldy0iEjVqaorE2cE\nAsaOznoOqIueiEh1BjXAzq5GDp0YJ53VCUURubRVbVBf09lAOpvXnfRE5JJXtUG9s6sBgP29OqEo\nIpe2qg3qy5rjrKuL8OybuvBFRC5tVRvUZsY7t7XwzOEhnCvbrUVERKpe1QY1wDu3tTI0mea1AT2Z\nXEQuXdUd1Je3APDTw0MVrkREpHKqOqi7muJc1hznp4eHK12KiEjFVHVQA7xzWws/OzJMLq92ahG5\nNFV9UL9jWwsTyay66YnIJavqg/qWK9YRChrff+lEpUsREamIqg/qhliIm9/SxvdeOqFueiJySar6\noAZ4784Ojo8leUHNHyJyCVoVQf1rV60n4gX4Zk9vpUsREV
lxqyKo66Mh7rx2A9998QTJTK7S5YiI\nrKhiHsW10cyeNLNXzOxlM3tgJQqb6wO7NjCRyvLUL05VYvMiIhVTzBF1FvjPzrmrgBuAT5rZVeUt\n63zv2NpCa22EJ/YfX+lNi4hU1KJB7Zw74Zx73h+eAA4BneUubC4vGOB9Ozv4l1dPMZbIrPTmRUQq\nZklt1Ga2mcLzE5+dZ9oeM+sxs57BwcHSVDfHXbu7SGfzPPrcL8uyfhGRalR0UJtZLfAt4EHn3Pjc\n6c65h51z3c657ra2tlLWeMaOzgZuuryVR37yJqmsTiqKyKWhqKA2sxCFkP6ac+7b5S1pYf/xV7Zx\naiLF48/3V7IMEZEVU0yvDwO+Chxyzv15+Uta2I2Xt7Cjs56/fvqIbtQkIpeEYo6obwR+E3iXme33\nf+4oc10XZGZ84lcu582hKX748slKlSEismK8xWZwzv0EsBWopWi37Whnc0ucL//bYW7b0U7hoF9E\nZG1aFVcmzhUMGL/7K9t4sW+Mb6mtWkTWuFUZ1AAf6d7I7k1NfO77h5hIql+1iKxdqzaogwHj03de\nxdBkmi88+UalyxERKZtVG9QAO7sauWt3F3/9b0d48lXdA0RE1qZVHdQAf3jHdtrro9z3dz/nYP9Y\npcsRESm5VR/UTTVhvvJb3US8AP/+b39O/2ii0iWJiJTUqg9qgGu6Gvin+28ilcnxm199luMKaxFZ\nQ9ZEUANcvq6OP/vItRwbnubOv/wJx4anKl2SiEhJrJmgBnjP1e08/nvvZDyZ4Xf+voeTY8lKlyQi\nctHWVFBDoSfIlz66m76Rae78wk949shwpUsSEbkoay6oofAw3O/8/k3EQkHuefhn/MG3XyKby1e6\nLBGRZVmTQQ3w1vV1/NP9N/HbN27h0ed+ycf/9jlOT6UrXZaIyJKt2aAGqIuG+KM7r+JPP7yTnx8d\n4c6//AkvH1dfaxFZXdZ0UM/4yNs38s3ffQe5vOMDX/h/PPDYC+zd10cyo6fEiEj1M+dKf/P97u5u\n19PTU/L1XqzBiRR/8s+v8m/7X6Mjf4LJyDruuOEaLl/fyO3XtBPxgpUuUUQuUWa2zznXPd+0Re9H\nvZa01UX4H3dfS/Kth4k+/t8AyD4T4BSNvPpEC8GGTrzGDUSaN7Kucws1rRuhfgPUbYBQtMLVi8il\natGgNrNHgPcBp5xzO8pfUvlFt90Ev/EYjPeTGOwlMNRLoP8I0dHXWTf6LPXHEvDCucvkIo1Q30mw\nYQPUd0B9J9R1FIK8fkNhONYEeoiBiJRYMUfUfwd8AfiH8paygmrXwRW3A1Dn/7QD2Vyek+NJXjo+\nQM9Lr/DG4deIJk7S5k7Tnh2hY/o0m06/SYc9T232NMacZiMv6od3ZyHMzxn2A712PQQvqS8yInKR\ninkU19Nmtrn8pVSeFwzQ1RSnq2kLN169BXgvE8kML/aOMTCe5ODpaf7ilQFePzWBy2VYxyjtdpp2\nO817NuZ5a2ycVnea+swgkd7nYOIElpvTJdACULPu3CPx+YbDNRX5NxCR6lPUyUQ/qL+7UNOHme0B\n9gBcdtllu48dO1aiEqvTqYkkzx8bYXAixfcPnuTF3lGm0md7kQQDRi6fp4kJOuw03S0proxPcEV8\ngg2BEZpzwzB+nODUCbz0+Hnrz0fqoW4Dgbr1EKkrBPeZn9rih4NhNceIrAILnUwsWVDPVq29Pspp\nKpVlcCLFm8NT/PzN05hBwIypVCG8jwxNcrB/nKHJ1HnLxkjSbiO022nWM0KHnWa9nabdRtjgTdDo\npYmTJEaCcC5BKL+Ee5gEvEUCfRnTvBgELomenSIrRr0+VkBNxKMm4rG5tYZfvWLdvPM45xgYT/Fi\n3yhvDk3R0RAlm3OEvQBhL0Dv6WkAxhMZjiQyjMTDfLN/jDeHp0hn85wYS5LLOwLkiZGiyUvT6KXp\nqsmzqQ421uRJJSbYXO
eoC6SI5hNEXZKoS1AXSOFlpyE9RSSTIJw6AempWT+T4Iq9zN7mhHkNhOKF\no3cvAsEIeOE5r5ElTovMWp//6kXPDgc8fVOQS4aCegWZGe0NUdob2pe1fDqb59REkkMnJjgyOMnx\n0QTZvGNoMsV3+8fpP5ogHu5iOr34hTxtdRFSmRzpXJ7L19WCc4RJ41KF4J6eHKOGJJc3Gt0bwmyq\nddQFU7jUFF5ummBmiriliLkEwew02eQUwXyCWGCCEBksm4ZcCrL+Ty5deHWlusjI5g/3+QJ+Zh4v\nCsEQBEL+q1d4DYb9cd7ZacXOF/AK42fPF/DnnTufPlhkmYrpnvcocAvQamZ9wKedc18td2FyvrA3\nc7IzDqw/b3oykyMaCjKZyjKdypLO5UlmckylchwenCTvoDbi8capCV4+Pk5jPMTAeIp0Nk86m6d/\nNMvVGzoJeQE2t8Spi4b4l0MDfOe1MZKZ4m9q1RQPsbWtlvaWKKPTaQbGU1zZXkdd1KOrIUx7bYB6\nL0/QZQjk0gRdipGxSaYT06SSCeLBHB01AcJk6KwP0Bx2RCxLIDcT/rM+BGY+ALKp88fl0jA95f+e\n9JdLQz4Duaz/mim8roSZUC/mQyHggQUL4R4IFk5CW3DWsP8TCPrzBc6dNnuZhaYtZ33nLGOAFV4t\ncHb4nNfAPONY4vyz18/ytrmKXVJXJsryJDM5jo8mGE9mqQkHMTOy+TzjiSzjiQw552iMhUhkchwZ\nnOLQiXGODk9xZHCK4ak0V7bXMZnKkszk522jny0aCpDO5snP7fkYMDqbYqQyebygkcnlecu6OtbV\nRUjn8gyMJ3ltYJKwFyCZyVET9qiLeoSCAYIBoyEWoiEeIpXJkc07aiMe6+uj1EdDhILw1rYYW5rC\ndNR5xINuVqBnIJ8tvObSZ4fPCfv0ufPNLHfefBdYx7wfHllwrvANxOUhn5s1nC+8upw/Pj9n2uxl\nZubLz5kvV8JvN1IK9tnxizuZuFQKarmQ0el0IeCTGVLZHLk8TCQz7OxqpCEWIuwFmEplOTmeJJHO\n8drABIMTKU5Pp/nl8DSxUJDxZJaAwRunJknn8phBa22Ey5rjxMNBIl6QsUSGZCZHIpMjl3cMTaaZ\nTGWoCXvk8o6TY0kmUtl5a9zaWkNnUwwvYNRFQwxPFb51xMMeU6nsmXMKBmxqqSEaChL2AkS8AO31\nUdrqItREPJKZHGEvQH00RF3UI5PL0xgP0xALndmWc450Ls9EMktrbWSF3oVZnJvzIVBEuJ/zgeAA\nN+s1P884N/+4BefPzxnHAvMvtA5/H6uew669R0EtMptzjlS28B85l3e8cmKcvpFpfjmc4ODxMU6O\nJUllC81GjfEQES/AWCJDW12ETM6RzhaalXpHpsnmHDnnis6D1tqzYX1seJqs//VhXV2E5powyUyO\n6XSOSCjAzq5GOhtjtNSESWfzTKazTCaz1EVDbGyOARD1Ch8UoWDhQ+65N08znszQEAvRUhsmHvYY\nGE9iQDQUpK2u8IGQyuZJZfNEvAChYOFbx9a2WjoaoiQzOSJekLqoR94V/r3qYyFGptJk8o6pVJbj\nowma4mF2djVgan+/aOr1ITKHmRENnb0J19s3N/P2zc1LXk8mlydgRjKTI+8cfSMJBidS5J3DCxSa\nYYanCs09wUCAgfEkfSPTDIynMOCGrS3Ew0FqIyF6R6YZnU4TD3vEw0EmklkO9I3y41cGznyohL0A\n8XCQqVSWTG7+T4baiEdHQ5ShyRRjiQx5V1guEgyQzuXPrKtU6qMeHQ0x/8Oq8A2hqzFOLBxkU0sc\nL2AEzGiIh8jlHJOpLJ1NsTPnT5LZHDMX+U6nc/SNTBOPeCTTOeIRjw2NUd4YmMTMqI8VmqzCwQAB\nMxrjISaSGbxg4RtN2AsQ9YIEA0ZtxKM+FiKbzxMOBoiHPdK5PE3xENFQkHQ2T9grrKdw
3YMjk8uT\nyzsa4iFqwh6j02lOT6VJZfMMTqSIh4N0NMRYVx8hGgoWvu1NpRlPZsjnCz24YuEgrbURJpNZhqdS\ntNVGz8w/H+ccubltfXMoqEUuQihY6E9eEyn8V9reEWJ7R2m34Vwh3KKh4JntJTM5xhKFk6BTqSzZ\nfOEoPxYO0tkYOycURqbS1MdCBAOFo95T40kiXhAvaMTDwcI3hFye4ckUR4enOT6aIB4uBNmk3zyU\nzTnGk4VvFBEvQE3Eoz4a4sRYgv29o5waT+EFC4GXz8PxsQQj02mePTJM3kF+1jcYL2BnvkVA4eIw\nz68tGgqyvj5CNu/86xCyDIwn2dpWixcwRvrTDE2mFw22lTAT7sVqiodo8M/lAEwks2c+5LOLrEdN\nHyKyIqZSWbygEQoEGJ5KEw0VTvTGwwsfL2Zzebxg4LxxGf/ovC5aOFLO+R84iXSOnHNMJrOMJgpH\nuoGAMZ3KEgwY48ksqWyOUCBAMps7880kHDTCXoBMzjGWyDCdztIUDxeO4L0ALTVhJvwL206MJklk\ncrTWhmlvKJyUDgaMdDbPdDrHwHiS2ohHa12Yock0A2NJTo4nOT2VPtOc1BgL4YCAQW0kxIPvfqua\nPkSksma+dQBn2smLMTekZ8Z5QYiFC98cLtSssJo8uMA0XQcsIlLlFNQiIlVOQS0iUuUU1CIiVU5B\nLSJS5RTUIiJVTkEtIlLlFNQiIlVOQS0iUuUU1CIiVa6ooDaz28zsF2b2hpk9VO6iRETkrEWD2syC\nwBeB24GrgN8ws6vKXZiIiBQUc0R9PfCGc+6Icy4NPAZ8oLxliYjIjGLuntcJ9M76vQ/4d3NnMrM9\nwB7/15SZHbz48iqqFRiqdBElsBb2Q/tQPdbCflTrPmy60ISS3ebUOfcw8DCAmfVc6L6qq8Va2AdY\nG/uhfagea2E/VuM+FNP00Q9snPV7lz9ORERWQDFB/XPgLWa2xczCwL3Ad8pbloiIzFi06cM5lzWz\n3wd+AASBR5xzLy+y2MOlKK7C1sI+wNrYD+1D9VgL+7Hq9qEsz0wUEZHS0ZWJIiJVTkEtIlLlShrU\nq+lSczM7amYvmdl+M+vxxzWb2Y/M7HX/tckfb2b2eX+/DpjZdRWs+xEzOzW7n/py6jazj/vzv25m\nH6+CffiMmfX778d+M7tj1rQ/8PfhF2b267PGV/Tvzcw2mtmTZvaKmb1sZg/441fN+7HAPqyq98PM\nomb2nJm96O/HZ/3xW8zsWb+mf/Q7RGBmEf/3N/zpmxfbv4pyzpXkh8KJxsPAViAMvAhcVar1l/oH\nOAq0zhn3p8BD/vBDwJ/4w3cA3wcMuAF4toJ13wxcBxxcbt1AM3DEf23yh5sqvA+fAf7LPPNe5f8t\nRYAt/t9YsBr+3oAO4Dp/uOhCkVIAAAMqSURBVA54za931bwfC+zDqno//H/TWn84BDzr/xt/A7jX\nH/9l4BP+8O8BX/aH7wX+caH9W8m/q/l+SnlEvRYuNf8A8Pf+8N8DH5w1/h9cwc+ARjPrqESBzrmn\ngdNzRi+17l8HfuScO+2cGwF+BNxW/uoLLrAPF/IB4DHnXMo59ybwBoW/tYr/vTnnTjjnnveHJ4BD\nFK7kXTXvxwL7cCFV+X74/6aT/q8h/8cB7wL2+uPnvhcz79Fe4FYzMy68fxVVyqCe71Lzhd7wSnPA\nD81snxUufwdY75w74Q+fBNb7w9W+b0utu1r35/f9JoFHZpoLWCX74H91fhuFI7lV+X7M2QdYZe+H\nmQXNbD9wisKH3WFg1DmXnaemM/X608eAFqpgP+ZzKZ9MvMk5dx2FuwJ+0sxunj3RFb4Hrbq+i6u1\nbuBLwDZgF3AC+LPKllM8M6sFvgU86Jwbnz1ttbwf8+zDqns/nHM559wuCldPXw9cWeGSSqaUQb2q\nLjV3zvX7r6eAxym8sQMzTRr+6yl/9mrft6XWXXX7
45wb8P+j5YGvcPbrZlXvg5mFKATc15xz3/ZH\nr6r3Y759WK3vB4BzbhR4EngHhealmQv7Ztd0pl5/egMwTBXtx2ylDOpVc6m5mdWYWd3MMPAe4CCF\nemfOuH8ceMIf/g7wW/5Z+xuAsVlfbavBUuv+AfAeM2vyv9K+xx9XMXPa/D9E4f2Awj7c65+l3wK8\nBXiOKvh789s0vwoccs79+axJq+b9uNA+rLb3w8zazKzRH44B76bQ3v4kcJc/29z3YuY9ugv4V//b\nz4X2r7JKeWaSwlnt1yi0Df1hJc+SLlLnVgpndl8EXp6plUIb1b8ArwM/Bprd2TPKX/T36yWgu4K1\nP0rhq2iGQvvZ7yynbuC3KZwoeQO4rwr24X/7NR6g8J+lY9b8f+jvwy+A26vl7w24iUKzxgFgv/9z\nx2p6PxbYh1X1fgA7gRf8eg8Cf+SP30ohaN8AvglE/PFR//c3/OlbF9u/Sv7oEnIRkSp3KZ9MFBFZ\nFRTUIiJVTkEtIlLlFNQiIlVOQS0iUuUU1CIiVU5BLSJS5f4/Sljlo1+EreoAAAAASUVORK5CYII=\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        },
        {
          "output_type": "stream",
          "text": [
            "Better model found at epoch 1 with valid_loss value: 0.9542644023895264.\n",
            "Better model found at epoch 2 with valid_loss value: 0.8755126595497131.\n",
            "Better model found at epoch 3 with valid_loss value: 0.8288350701332092.\n",
            "Better model found at epoch 4 with valid_loss value: 0.7948615550994873.\n",
            "Better model found at epoch 5 with valid_loss value: 0.7777946591377258.\n",
            "Better model found at epoch 6 with valid_loss value: 0.7700592279434204.\n",
            "Better model found at epoch 7 with valid_loss value: 0.7698812484741211.\n"
          ],
          "name": "stdout"
        }
      ]
    },
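    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "To make the early stopping settings concrete, here is a minimal plain-Python sketch of the idea behind `min_delta = 0.01` and `patience = 3`, replayed on the validation losses from the table above. The helper `early_stop_epoch` is hypothetical and written only for illustration; fastai's `EarlyStoppingCallback` may count improvements and waiting epochs slightly differently.\n",
        "\n",
        "```python\n",
        "def early_stop_epoch(valid_losses, min_delta=0.01, patience=3):\n",
        "    # Returns the epoch at which training would stop, or None if it never stops.\n",
        "    best = float('inf')\n",
        "    wait = 0\n",
        "    for epoch, loss in enumerate(valid_losses):\n",
        "        if loss < best - min_delta:\n",
        "            # Improved by more than min_delta: remember it and reset the counter.\n",
        "            best, wait = loss, 0\n",
        "        else:\n",
        "            # Not enough improvement: one more epoch of waiting.\n",
        "            wait += 1\n",
        "            if wait > patience:\n",
        "                return epoch\n",
        "    return None\n",
        "\n",
        "# Validation losses copied from the training table above.\n",
        "valid_losses = [1.133453, 0.954264, 0.875513, 0.828835,\n",
        "                0.794862, 0.777795, 0.770059, 0.769881]\n",
        "print(early_stop_epoch(valid_losses))\n",
        "```\n",
        "\n",
        "Under these assumptions, the last two epochs improve by less than `min_delta`, so the wait counter reaches 2 but never exceeds `patience`, which is consistent with all 8 epochs running."
      ]
    },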
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "L3EoBzOp0jXx",
        "colab_type": "text"
      },
      "source": [
        "# Evaluate your model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cWZJjVnJ3q5-",
        "colab_type": "text"
      },
      "source": [
        "Let us now evaluate your trained model on some examples from your validation set to see how well it generates comments."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "nKc8vHEic1EV",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#collapse_show\n",
        "def get_predictions(learn, ds_type=DatasetType.Valid):\n",
        "    \"\"\"Runs the model over a dataset and returns the reconstructed inputs, targets, and predictions.\"\"\"\n",
        "    learn.model.eval()\n",
        "    inputs, targets, outputs = [],[],[]\n",
        "    with torch.no_grad():\n",
        "        for xb,yb in progress_bar(learn.dl(ds_type)):\n",
        "            out = learn.model(*xb)\n",
        "            for x,y,z in zip(xb[0],xb[1],out):\n",
        "                inputs.append(learn.data.train_ds.x.reconstruct(x.cpu()))\n",
        "                targets.append(learn.data.train_ds.y.reconstruct(y.cpu()))\n",
        "                # Greedy decoding: keep the highest-scoring token at each position\n",
        "                outputs.append(learn.data.train_ds.y.reconstruct(z.cpu().argmax(1)))\n",
        "    return inputs, targets, outputs\n",
        "\n",
        "inputs, targets, outputs = get_predictions(learn)"
      ],
      "execution_count": 0,
      "outputs": []
    },
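    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "The `z.cpu().argmax(1)` call in `get_predictions` keeps, for every output position, the index of the highest-scoring token. Here is a minimal plain-Python sketch of that step. The helper `greedy_tokens`, the toy vocabulary, and the scores are all made up for illustration and are not part of the model above.\n",
        "\n",
        "```python\n",
        "def greedy_tokens(scores, itos):\n",
        "    # scores: one list of vocabulary scores per output position.\n",
        "    # itos: maps a vocabulary index back to its token string.\n",
        "    tokens = []\n",
        "    for row in scores:\n",
        "        best_idx = max(range(len(row)), key=lambda i: row[i])\n",
        "        tokens.append(itos[best_idx])\n",
        "    return tokens\n",
        "\n",
        "itos = ['xxbos', 'encodes', 'a', 'sequence', 'xxeos']  # toy vocabulary\n",
        "scores = [\n",
        "    [2.0, 0.1, 0.3, 0.2, 0.0],\n",
        "    [0.1, 1.5, 0.2, 0.4, 0.0],\n",
        "    [0.0, 0.2, 1.1, 0.9, 0.1],\n",
        "    [0.0, 0.1, 0.3, 0.2, 2.2],\n",
        "]\n",
        "print(greedy_tokens(scores, itos))\n",
        "```\n",
        "\n",
        "Each position is decoded on its own here, so a wrong token is never revisited later. That may be one reason some predictions look garbled."
      ]
    },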
    {
      "cell_type": "code",
      "metadata": {
        "id": "VPuJ22MgYHjC",
        "colab_type": "code",
        "outputId": "bee38ed6-6c49-47fb-bb90-1b50df537957",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        }
      },
      "source": [
        "#collapse_show\n",
        "def print_results(inputs, targets, outputs, method_spm, comment_spm, n = 10):\n",
        "    \"\"\"Just a little helper function for printing out the results from your model.\"\"\"\n",
        "    for i in range(n):\n",
        "        print(\"Input:\", \" \".join(decode_spec_tokens(method_spm.DecodePieces(str(inputs[i]).split(\" \")).split(\" \"))), \"\\n\")\n",
        "        print(\"Target:\", \" \".join(decode_spec_tokens(comment_spm.DecodePieces(str(targets[i]).split(\" \")).split(\" \"))), \"\\n\")\n",
        "        print(\"Predicted:\", \" \".join(decode_spec_tokens(comment_spm.DecodePieces(str(outputs[i]).split(\" \")).split(\" \"))), \"\\n\")\n",
        "        \n",
        "print_results(inputs, targets, outputs, method_spm, comment_spm)"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Input: xxbos @doesservicerequest private void putrangeinternal(final filerange range, final filerangeoperationtype operationtype, final byte[] data, final long length, final String md5, final accesscondition accesscondition, final filerequestoptions options, final operationcontext opcontext) throws storageexception { executionengine.executewithretry(this.fileserviceclient, this, putrangeimpl(range, operationtype, data, length, md5, accesscondition, options, opcontext), options.getretrypolicyfactory(), opcontext); } xxeos \n",
            "\n",
            "Target: xxbos Used for both uploadrange and clearrange. xxeos \n",
            "\n",
            "Predicted: xxbos Put to creating thes( s( xxeos \n",
            "\n",
            "Input: xxbos public static byte[] encodesequence(byte[]... encodedvalues) { int length = 0; for (byte[] encodedvalue : encodedvalues) { length += encodedvalue.length; } byte[] lengthencoded = encodelength(length); bytearraydataoutput out = bytestreams.newdataoutput(1 + lengthencoded.length + length); out.write(sequence_tag); out.write(lengthencoded); for (byte[] entry : encodedvalues) { out.write(entry); } return out.tobytearray(); } xxeos \n",
            "\n",
            "Target: xxbos Encodes a sequence of encoded values. xxeos \n",
            "\n",
            "Predicted: xxbos Encodes a byte of bytes bytes into xxeos \n",
            "\n",
            "Input: xxbos @override public String dnsresolveex(string host) { stringbuilder result = new stringbuilder(); try { inetaddress[] list = inetaddress.getallbyname(host); for (inetaddress inetaddress : list) { result.append(inetaddress.gethostaddress()); result.append(\"; \"); } } catch (unknownhostexception e) { log.log(level.fine, \"DNS name not resolvable {0}.\", host); } return result.tostring(); } xxeos \n",
            "\n",
            "Target: xxbos *********************************************************************** dnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveexdnsresolveex xxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeosxxeos \n",
            "\n",
            "Predicted: xxbos xxmap 51 * namepp  =pxxeos \n",
            "\n",
            "Input: xxbos protected void removeallfromattributevalueset() { final collection<abstracthtml5sharedobject> sharedobjects = getsharedobjects(); boolean listenerinvoked = false; final collection<writelock> writelocks = lockandgetwritelocks(); try { getattributevalueset().clear(); setmodified(true); invokevaluechangelisteners(sharedobjects); listenerinvoked = true; } finally { for (final Lock lock : writelocks) { lock.unlock(); } } pushqueues(sharedobjects, listenerinvoked); } xxeos \n",
            "\n",
            "Target: xxbos clears all values from the value set. xxeos \n",
            "\n",
            "Predicted: xxbos s all attribute from the    xxeos \n",
            "\n",
            "Input: xxbos public void registercheckwithnotes(string checkid, String name, String script, long interval, @suppresswarnings(\"sameparametervalue\") String notes) { Check check = new Check(); check.setid(checkid); check.setname(name); check.setscript(script); check.setinterval(string.format(\" ⁇ ss\", interval)); check.setnotes(notes); registercheck(check); } xxeos \n",
            "\n",
            "Target: xxbos Registers a Health Check with the Agent. xxeos \n",
            "\n",
            "Predicted: xxbos Registers a xxupj ealth checkxxmaj  check a givenxxmaj  name xxeos \n",
            "\n",
            "Input: xxbos public void assertequalsignoringcase(@nullable Description description, @nullable String actual, @nullable String expected) { if (!areequalignoringcase(actual, expected)) { String format = \"expecting:< ⁇ s> to be equal to:< ⁇ s>, ignoring case considerations\"; throw failures.failure(description, new basicerrormessagefactory(format, actual, expected)); } } xxeos \n",
            "\n",
            "Target: xxbos Verifies that two s are equal, ignoring case considerations. xxeos \n",
            "\n",
            "Predicted: xxbos Assert that the stringsxx are equal, oring the... xxeos \n",
            "\n",
            "Input: xxbos protected cronschedulebuilder createcronschedulebuilder(string cronexpr) { int i = cronexpr.indexof(\"[\"); int j = cronexpr.indexof(\"]\"); timezone timezone = defaulttimezone; if (i > -1 && j > -1) { timezone = timezone.gettimezone(cronexpr.substring(i+1, j)); cronexpr = cronexpr.substring(0, i).trim(); } return cronschedulebuilder.cronschedule(cronexpr).intimezone(timezone); } xxeos \n",
            "\n",
            "Target: xxbos Allow timezone to be configured on a per-cron basis with [timezonename] appended to the cron format xxeos \n",
            "\n",
            "Predicted: xxbos Create to to create used to the mtimei-s of a0],]. to the givenath.xxeos \n",
            "\n",
            "Input: xxbos private <T> fakeencodeditem readnextitem(class<t> clazz) { fakeencodeditem item = data[dataposition]; if (item == null) { / / While Parcel will treat these as zeros, in tests, this is almost always an error. throw new unreliablebehaviorerror(\"reading uninitialized data at position \" + dataposition); } checkconsistentreadandincrementposition(clazz, item); return item; } xxeos \n",
            "\n",
            "Target: xxbos Reads a complete item in the byte buffer. xxeos \n",
            "\n",
            "Predicted: xxbos Read the   from the given array. xxeos \n",
            "\n",
            "Input: xxbos private void hidesuggestionsifnecessary(final @nonnull querytoken currentquery, final @nonnull tokensource source) { String queryts = currentquery.gettokenstring(); String currentts = source.getcurrenttokenstring(); if (!iswaitingforresults(currentquery) && queryts != null && queryts.equals(currentts)) { msuggestionsvisibilitymanager.displaysuggestions(false); } } xxeos \n",
            "\n",
            "Target: xxbos Hides the suggestions if there are no more incoming queries. xxeos \n",
            "\n",
            "Predicted: xxbos Check the givenion of of the is no more  . xxeos \n",
            "\n",
            "Input: xxbos public list<uirow> getvalues() throws efapsexception { list<uirow> ret = new arraylist<>(); if (isfiltered()) { for (final uirow row : this.values) { boolean filtered = false; for (final tablefilter filter : this.filters.values()) { filtered = filter.filterrow(row); if (filtered) { break; } } if (!filtered) { ret.add(row); } } } else { ret = this.values; } setsize(ret.size()); return ret; } xxeos \n",
            "\n",
            "Target: xxbos This is the getter method for the instance variable . xxeos \n",
            "\n",
            "Predicted: xxbos Returns method a first method for the row of. xxeos \n",
            "\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "i3N0SNzK4Hpf",
        "colab_type": "text"
      },
      "source": [
        "This is great and all. However, you can see the text looks a bit off. Your model sort of starts generating some word and then switches half way through sometimes. This is because the way you are currently trying to generate tokens is using [Teacher Forcing](https://towardsdatascience.com/what-is-teacher-forcing-3da6217fed1c), which means you are giving the model the groundtruth for what it should have produced even if it did not. This is very helpful during training, however, it expects to have both the x and y of an input. In a real world setting, you aren't going to be given the y, obviously! \n",
        "\n",
        "Therefore, I found a hacky way of bypassing this need for the y so that it is no longer needed. This involves using an empty array that fakes being the y, but has only ones and is updated everytime the model makes a prediction and is then fed back into the model so that is knows what it has generated before.\n",
        "\n",
        "**Heads Up** The way I coded this is extremely inefficient and so running it will take a long time to generate predictions. Therefore I recommend only generating a few comments (I set it up to only do 10).\n",
        "\n",
        "**TODO For You:** Come up with a more efficient solution that performs similarly to the Teacher Forcing approach of the above code."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7S1KskGj5xV9",
        "colab_type": "text"
      },
      "source": [
        "P.S.\n",
        "\n",
        "For other [language learners](https://docs.fast.ai/text.learner.html#LanguageLearner.predict) provided by FastAI, you can simply use the `predict` function and pass some text and ask for the model to predict the next set of tokens. However, I have been unsuccessful in implementing this `predict` function for Sequence to Sequence models. So, another **TODO For You** is to see if you can implement a `predict` function for Sequence to Sequence models so that you can easily generate comments for methods that you just pass to the function!\n",
        "\n",
        "If you do figure out a way to do this, I would be extremely interested! So, feel free to leave a comment about it."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "TF8HwJaGdP_X",
        "colab_type": "code",
        "outputId": "fee24585-9c36-480d-c3f3-84b3ae07605b",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        }
      },
      "source": [
        "#collapse_show\n",
        "def get_preds(learn, db_tst, max_seq = 128, n = 10):\n",
        "    learn.model.eval()\n",
        "    inpts, trgts, preds = [], [], []\n",
        "    for i, (xb,yb) in enumerate(progress_bar(db_tst.dl(DatasetType.Train))):\n",
        "        if i >= n: break\n",
        "        res = torch.zeros(len(xb[0]), max_seq, device = torch.device('cuda')).long() + 1\n",
        "        for i in range(max_seq - 1):\n",
        "            outs = learn.model(xb[0], res)\n",
        "            for j, out in enumerate(outs):\n",
        "                res[j][i + 1] =  out.argmax(1)[i]\n",
        "        for x, y, z in zip(xb[0], yb, res):\n",
        "            inpts.append(str(learn.data.train_ds.x.reconstruct(x.cpu())))\n",
        "            trgts.append(str(db_tst.train_ds.y.reconstruct(y.cpu())))\n",
        "            preds.append(str(learn.data.train_ds.y.reconstruct(z.cpu())))\n",
        "    return inpts, trgts, preds\n",
        "\n",
        "inputs, targets, outputs = get_preds(learn, db_tst)\n",
        "print_results(inputs, targets, outputs, method_spm, comment_spm)"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "text/html": [
              "\n",
              "    <div>\n",
              "        <style>\n",
              "            /* Turns off some styling */\n",
              "            progress {\n",
              "                /* gets rid of default border in Firefox and Opera. */\n",
              "                border: none;\n",
              "                /* Needs to be in here for Safari polyfill so background images work as expected. */\n",
              "                background-size: auto;\n",
              "            }\n",
              "            .progress-bar-interrupted, .progress-bar-interrupted::-webkit-progress-bar {\n",
              "                background: #F44336;\n",
              "            }\n",
              "        </style>\n",
              "      <progress value='10' class='' max='149', style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
              "      6.71% [10/149 01:14<17:21]\n",
              "    </div>\n",
              "    "
            ],
            "text/plain": [
              "<IPython.core.display.HTML object>"
            ]
          },
          "metadata": {
            "tags": []
          }
        },
        {
          "output_type": "stream",
          "text": [
            "Input: xxbos protected String parseunquotedstringcontent() { final int startndx = ndx; while (true) { final char c = input[ndx]; if (c <= ' ' || charutil.equalsone(c, UNQUOTED_DELIMETERS)) { final int currentndx = ndx; / / done skipwhitespaces(); return new String(input, startndx, currentndx - startndx); } ndx++; } } xxeos \n",
            "\n",
            "Target: xxbos Parses un-quoted string content. xxeos \n",
            "\n",
            "Predicted: xxbos Parses the text from the HTML text. xxeos \n",
            "\n",
            "Input: xxbos private static void checkfilecopy(final File srcfile, final File destfile) throws ioexception { checkexists(srcfile); checkisfile(srcfile); if (equals(srcfile, destfile)) { throw new ioexception(\"files '\" + srcfile + \"' and '\" + destfile + \"' are equal\"); } File destparent = destfile.getparentfile(); if (destparent != null && !destparent.exists()) { checkcreatedirectory(destparent); } } xxeos \n",
            "\n",
            "Target: xxbos Checks that file copy can occur. xxeos \n",
            "\n",
            "Predicted: xxbos Checks if the file is a file. xxeos \n",
            "\n",
            "Input: xxbos long analyze() { Arc a; Arc aa; if (pre.outs == null) { return flags.reg_uimpossible; } for (a = pre.outs; a != null; a = a.outchain) { for (aa = a.to.outs; aa != null; aa = aa.outchain) { if (aa.to == post) { return flags.reg_uemptymatch; } } } return 0; } xxeos \n",
            "\n",
            "Target: xxbos analyze - ascertain potentially-useful facts about an optimized NFA xxeos \n",
            "\n",
            "Predicted: xxbos Returns the Syoooo Syna Syna Syna Sa Sa Sa Sa Sa Sa Sa Sa syna Syna xxeos \n",
            "\n",
            "Input: xxbos @suppresswarnings(\"unchecked\") public REC next() { checkdirection(true); orecord record; / / ITERATE UNTIL THE NEXT GOOD RECORD while (hasnext()) { / / FOUND if (currentrecord != null) { try { return (REC) currentrecord; } finally { currentrecord = null; } } record = gettransactionentry(); if (record != null) return (REC) record; } return null; } xxeos \n",
            "\n",
            "Target: xxbos Return the element at the current position and move forward the cursor to the next position available. xxeos \n",
            "\n",
            "Predicted: xxbos Returns the next record in the queue. xxeos \n",
            "\n",
            "Input: xxbos public static void addtransitivematches(hollowreadstateengine stateengine, map<string, bitset> matches) { list<hollowschema> schemalist = hollowschemasorter.dependencyorderedschemalist(stateengine); collections.reverse(schemalist); for(hollowschema schema : schemalist) { bitset currentmatches = matches.get(schema.getname()); if(currentmatches != null) { addtransitivematches(stateengine, schema.getname(), matches); } } } xxeos \n",
            "\n",
            "Target: xxbos Augment the given selection by adding the references, and the <i>transitive< / i> references, of our selection. xxeos \n",
            "\n",
            "Predicted: xxbos Add a variable to the list of s. xxeos \n",
            "\n",
            "Input: xxbos protected void resolvenestedproperties(final beanproperty bp) { String name = bp.name; int dotndx; while ((dotndx = indexofdot(name)) != -1) { bp.last = false; bp.setname(name.substring(0, dotndx)); bp.updatebean(getindexproperty(bp)); name = name.substring(dotndx + 1); } bp.last = true; bp.setname(name); } xxeos \n",
            "\n",
            "Target: xxbos Resolves nested property name to the very last indexed property. If forced, <code>null< / code> or non-existing properties will be created. xxeos \n",
            "\n",
            "Predicted: xxbos Resolve the property name. xxeos \n",
            "\n",
            "Input: xxbos public xmlconfig declarenamespace(string prefix, String namespaceuri) { validate.notempty(prefix, \"prefix cannot be empty\"); validate.notempty(namespaceuri, \"namespace URI cannot be empty\"); map<string, String> updatednamespaces = new hashmap<string, string>(declarednamespaces); updatednamespaces.put(prefix, namespaceuri); return new xmlconfig(features, updatednamespaces, properties, validating, true, allowdoctypedeclaration, true); } xxeos \n",
            "\n",
            "Target: xxbos Declares a namespace and also sets } to <code>true< / code>. <p / > <p>note that you cannot use this to add namespaces for the matcher. This has to be done by providing a to the matcher instance.< / p> xxeos \n",
            "\n",
            "Predicted: xxbos Creates a new instance of the given name and the given name. xxeos \n",
            "\n",
            "Input: xxbos protected static int getshadowradius(drawable shadow, Drawable circle) { int radius = 0; if (shadow != null && circle != null) { Rect rect = new Rect(); radius = (circle.getintrinsicwidth() + (shadow.getpadding(rect) ? rect.left + rect.right : 0)) / 2; } return Math.max(1, radius); } xxeos \n",
            "\n",
            "Target: xxbos Calculates required radius of shadow. xxeos \n",
            "\n",
            "Predicted: xxbos Get the Syyyy Syyy Syyy Syna Syna Syna Syna Syna Sna Syna Syna Syna xxeos \n",
            "\n",
            "Input: xxbos void addadviceclinitmethod(final String name) { if (adviceclinits == null) { adviceclinits = new arraylist<>(); } adviceclinits.add(name); } xxeos \n",
            "\n",
            "Target: xxbos Saves used static initialization blocks (clinit) of advices. xxeos \n",
            "\n",
            "Predicted: xxbos Adds a Java class to the Saa Sa Sa Syyetch Bean. xxeos \n",
            "\n",
            "Input: xxbos public static String padleft(string s, int desiredlength, String padstring) { while (s.length() < desiredlength) { s = padstring + s; } return s; } xxeos \n",
            "\n",
            "Target: xxbos Pad the given string with padstring on the left up to the given length. xxeos \n",
            "\n",
            "Predicted: xxbos Compares two strings, and returns the first character in the string. xxeos \n",
            "\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Sf6vxn6jbSqm",
        "colab_type": "text"
      },
      "source": [
        "Not too shabby if I do say so myself! It seems to actually be learning about what a comment is supposed to have in it for documenting what the method is doing. Of course there are a lot of tweaks you could do such as adding the ability to generate inline comments instead of just method level, using more data, using different sampling schemes for generating the comments such as [top-k or nucleus](https://towardsdatascience.com/how-to-sample-from-language-models-682bceb97277), and any other awesome things you could think of! And if you do feel free to leave a comment about your adventure.\n",
        "\n",
        "**Tip:**\n",
        "I have done a lot of fiddling to get this to work, however, most of my models ended up overfitting. The way I fixed this issue was just being more careful in how I clean the data and also increase the data size. I know this seems simple, but it is quite effective."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FhEBU05D6gL2",
        "colab_type": "text"
      },
      "source": [
        "# Conclusion\n",
        "In this tutorial, you created an Automatic Code Comment Generator! You learned about how to clean, explore, and process data and how to use the awesome Pytorch and FastAI to define and train the awesome Transformer architecture. The use of Deep Learning in the field of Software Engineering is what I am studying for my Ph.D., so I hope I have inspired you to think about some other ways you could use Deep Learning to help Software Engineering!\n",
        "\n",
        "I hope you enjoyed this tutorial and look out for future blog posts from me about all kinds of topics as I have announced a challenge for myself for learning new things this year!\n",
        "\n",
        "> twitter: https://twitter.com/ncooper57/status/1235408134904086529?s=20"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Kpr7_elSqWRU",
        "colab_type": "text"
      },
      "source": [
        "# CodeSearchNet citation\n",
        "```\n",
        "{% raw %}\n",
        "@article{husain_codesearchnet_2019,\n",
        "    title = {{CodeSearchNet} {Challenge}: {Evaluating} the {State} of {Semantic} {Code} {Search}},\n",
        "    shorttitle = {{CodeSearchNet} {Challenge}},\n",
        "    url = {http://arxiv.org/abs/1909.09436},\n",
        "    urldate = {2020-03-12},\n",
        "    journal = {arXiv:1909.09436 [cs, stat]},\n",
        "    author = {Husain, Hamel and Wu, Ho-Hsiang and Gazit, Tiferet and Allamanis, Miltiadis and Brockschmidt, Marc},\n",
        "    month = sep,\n",
        "    year = {2019},\n",
        "    note = {arXiv: 1909.09436},\n",
        "}\n",
        "{% endraw %}\n",
        "```"
      ]
    }
  ]
}